Lipid metabolism transcription factor

ABSTRACT

The invention provides a mammalian nucleic acid and fragments thereof. It also provides for the use of these nucleic acids in a model system for the characterization, diagnosis, evaluation, and treatment of conditions, diseases and disorders associated with expression of the mammalian nucleic acid. The invention additionally provides expression vectors and host cells for the production of the protein encoded by the mammalian nucleic acid.

[0001] This application is a continuation of U.S. Ser. No. 09/709,976, filed Nov. 10, 2000, which is a divisional of U.S. Ser. No. 09/318,978, filed May 26, 1999.

FIELD OF THE INVENTION

[0002] This invention relates to nucleic acid and amino acid sequences of a new mammalian protein and to the use of these sequences in the characterization, diagnosis, and treatment of cell proliferative and lipid disorders.

BACKGROUND OF THE INVENTION

[0003] Phylogenetic relationships among organisms have been demonstrated many times, and studies from a diversity of prokaryotic and eukaryotic organisms suggest a more or less gradual evolution of biochemical and physiological mechanisms and metabolic pathways. Despite different evolutionary pressures, proteins that regulate the cell cycle in yeast, nematode, fly, rat, and man have common chemical or structural features and modulate the same general cellular activity. Comparisons of human gene sequences with those from other organisms where the structure and/or function may be known allow researchers to draw analogies and to develop model systems for testing hypotheses. These model systems are of great importance in developing and testing diagnostic and therapeutic agents for human conditions, diseases and disorders.

[0004] Fatty acids are required for phospholipid, glycolipid, hormone, and intracellular messenger formation; to anchor proteins to membranes; and as fuel molecules. Most cells can synthesize fatty acids from acetate substrates, though many mammalian cells also obtain fatty acids by hydrolysis of triglycerides. Synthesis of phospholipids primarily occurs on the surface of the smooth endoplasmic reticulum. Although most cells constitutively form fatty acids, the level of synthesis varies with the needs of the cell. During phases of rapid cell division, membrane formation requires enhanced production of phospholipids. Animals that have fasted and are then fed high-carbohydrate, low-fat diets show marked increases in the amount and activity of enzymes responsible for fatty acid synthesis. Increased synthesis of long chain fatty acids also occurs in multiple common neoplasms, including those arising in the breast, prostate, ovary, colon, and endometrium. Overexpression of fatty acid synthase (FAS), a major enzyme of fatty acid biosynthesis, is a marker for poor prognosis in breast tumors and has been shown to be important for tumor growth (Moncur et al. (1997) Proc Natl Acad Sci 95:6989-6994).

[0005] The transcriptional regulation of enzymes involved in fatty acid synthesis is associated with Spot 14 (S14) protein. S14 is a small, acidic nuclear protein with a carboxyterminal “zipper” domain involved in homodimer formation. It is expressed in tissues that produce lipids for use as metabolic fuels, such as lactating mammary tissue, white and brown adipose tissue, and liver. The expression of S14 is increased in response to insulin, dietary carbohydrates, glucose, and thyroid hormone and reduced in response to glucagon, fasting, and in diabetes mellitus. Expression of antisense oligonucleotides has shown that S14 induces tissue-specific expression of several lipogenic enzymes including FAS and ATP citrate lyase. The S14 gene is located on chromosome 11 at position q13.5, a chromosomal region amplified in approximately 20% of breast cancers, and is expressed in several breast cancer-derived cell lines and in a majority of primary breast tumors (Cunningham et al. (1998) Thyroid 8:815-825; Liaw and Towle (1984) J Biol Chem 259:7253-7260; Brown et al. (1997) J Biol Chem 272:2163-2166; and Moncur, supra).

[0006] A zebrafish gastrulation protein, G12, shares features with S14 including acidic pI (~4.9) and nearly identical size (~17 kDa). The sequence similarity between the two proteins is strongest at the carboxyterminus, including the zipper domain. G12 is expressed in an outer, enveloping layer of cells (EVL), analogous to the mammalian trophectoderm, during a period in gastrulation in which the EVL layer expands to cover the developing, embryonic deep cell layer. During this stage, apical membrane turnover in the EVL increases and raises the requirement for phospholipids used in plasma membranes (Conway (1995) Mech Dev 52:383-391; Fink and Cooper (1996) Dev Biol 174:180-189).

[0007] The discovery of a polynucleotide encoding a new mammalian protein satisfies a need in the art by providing new compositions which are useful in the characterization, diagnosis, and treatment of cell proliferative and lipid disorders.

SUMMARY OF THE INVENTION

[0008] The invention is based on the discovery of a polynucleotide encoding a mammalian protein, lipid metabolism transcription factor (LMTF), which satisfies a need in the art by providing new compositions useful in the characterization, diagnosis, and treatment of cell proliferative and lipid disorders.

[0009] The invention provides an isolated and purified mammalian polynucleotide comprising the nucleic acid sequence of SEQ ID NO:1 or a fragment thereof. The invention also provides fragments homologous to the mammalian polynucleotide from rat, mouse, and monkey.

[0010] The invention further provides an isolated and purified polynucleotide or a fragment thereof which hybridizes under high stringency conditions to the polynucleotide of SEQ ID NO:1. The invention also provides an isolated and purified polynucleotide which is complementary to the polynucleotide of SEQ ID NO:1. In one aspect, a single stranded complementary RNA or DNA molecule is used as a probe which hybridizes under high stringency conditions to the mammalian polynucleotide or a fragment thereof.

[0011] The invention further provides a method for detecting a polynucleotide in a sample containing nucleic acids, the method comprising the steps of: (a) hybridizing a probe to at least one of the nucleic acids of the sample, thereby forming a hybridization complex; and (b) detecting the hybridization complex, wherein the presence of the hybridization complex correlates with the presence of a polynucleotide in the sample. In one aspect, the method further comprises amplifying the polynucleotide prior to hybridization. The polynucleotide or fragment thereof may comprise an element or target on a microarray. The invention also provides a method for screening a library of molecules for specific binding to a polynucleotide or a fragment thereof, the method comprising providing a library of molecules, combining the polynucleotide of claim 1 with a plurality of molecules under conditions which allow specific binding, and detecting binding of the polynucleotide to each of a plurality of molecules, thereby identifying at least one molecule which specifically binds the polynucleotide. Such molecules are potential regulators of polynucleotide function.

[0012] The invention also provides an expression vector containing at least a fragment of the polynucleotide of SEQ ID NO:1. In another aspect, the expression vector is contained within a host cell. The invention further provides a method for producing a protein, the method comprising the steps of culturing the host cell for expression of the protein and recovering the protein from the host cell culture. The invention also provides an isolated and purified protein comprising the amino acid sequence of SEQ ID NO:2 or a portion thereof. Additionally, the invention provides a composition comprising a purified protein having the sequence of SEQ ID NO:2 or a portion thereof in conjunction with a pharmaceutical carrier.

[0013] The invention further provides a method for using a portion of the protein to produce antibodies. The invention also provides a method for using a protein or a portion thereof to screen for molecules which specifically bind the protein, the method comprising the steps of combining the protein or a portion thereof with a library of molecules under conditions which allow complex formation and detecting complex formation, wherein the presence of the complex identifies a molecule which specifically binds the protein. In one aspect, a molecule identified using the method increases the activity of the protein. In another aspect, a molecule identified using the method decreases the activity of the protein.

BRIEF DESCRIPTION OF THE FIGURES AND TABLE

[0014] FIGS. 1A, 1B, 1C, 1D, 1E, and 1F show the nucleic acid sequence (SEQ ID NO:1) encoding the amino acid sequence (SEQ ID NO:2) of the mammalian protein. The alignment was produced using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco Calif.).

[0015] FIGS. 2A and 2B show the chemical and structural similarity between SEQ ID NO:2, G12 (GI 861207; SEQ ID NO:26) and Spot14 (GI 1171574; SEQ ID NO:27), produced using the multisequence alignment program of LASERGENE software (DNASTAR, Madison Wis.). The amino acids of SEQ ID NO:2, from residue 45 to residue 59, may be used for antibody production.

[0016] Table 1 shows the ESTs from human, rat, mouse, and monkey which have homology with SEQ ID NO:1 and includes their nucleotide length, biological source, region of overlap with SEQ ID NO:1, and percent identity with SEQ ID NO:1.

DESCRIPTION OF THE INVENTION

[0017] It is understood that this invention is not limited to the particular machines, materials and methods described. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present invention which will be limited only by the appended claims. As used herein, the singular forms “a”, “an”, and “the” include plural reference unless the context clearly dictates otherwise. For example, a reference to “a host cell” includes a plurality of such host cells known to those skilled in the art.

[0018] Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are cited for the purpose of describing and disclosing the cell lines, protocols, reagents and vectors which are reported in the publications and which might be used in connection with the invention. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

[0019] Definitions

[0020] “LMTF” refers to a purified protein, lipid metabolism transcription factor, obtained from any mammalian species, including murine, bovine, ovine, porcine, simian, and preferably the human species, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

[0021] “Agents, molecules, or compounds” are used interchangeably and refer to that which interacts with, specifically binds to, or modifies the expression of the polynucleotides and proteins of the invention; and may be composed of at least one of the following: nucleic acids, proteins, carbohydrates, fats, lipids, organic and inorganic substances.

[0022] “Biologically active” refers to a protein having structural, immunological, regulatory, or chemical functions of a naturally occurring, recombinant or synthetic molecule.

[0023] “Complementary” refers to the natural base pairing by hydrogen bonding between purines and pyrimidines. For example, the sequence A-C-G-T forms hydrogen bonds with its complement T-G-C-A or U-G-C-A. Two single-stranded molecules may be considered partially complementary, if only some of the nucleotides bond, or completely complementary, if nearly all of the nucleotides bond. The degree of complementarity between nucleic acid strands affects the efficiency and strength of the hybridization and amplification reactions.
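
The base pairing described above can be illustrated with a minimal sketch. The function names `complement` and `fraction_paired` are hypothetical and are not part of the disclosure; the sketch pairs bases position by position, as in the A-C-G-T example, and does not model strand orientation or hybridization conditions.

```python
# Watson-Crick pairing partners for a DNA strand (A-T, C-G).
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(seq, rna=False):
    """Return the base-by-base complement of a DNA sequence.

    With rna=True the complement is written as RNA, so A pairs with U.
    """
    pairs = dict(DNA_COMPLEMENT)
    if rna:
        pairs["A"] = "U"
    return "".join(pairs[base] for base in seq.upper())


def fraction_paired(strand_a, strand_b):
    """Fraction of positions at which strand_b pairs with strand_a.

    A value of 1.0 corresponds to complete complementarity; lower
    values correspond to partial complementarity.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have the same length")
    # Treat U in an RNA strand like T for the purpose of pairing.
    b = strand_b.upper().replace("U", "T")
    paired = sum(
        1 for x, y in zip(strand_a.upper(), b) if DNA_COMPLEMENT[x] == y
    )
    return paired / len(strand_a)
```

For the example in the text, `complement("ACGT")` gives `"TGCA"` and `complement("ACGT", rna=True)` gives `"UGCA"`.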

[0024] “Derivative” refers to the chemical modification of a polynucleotide or protein sequence. Chemical modifications of a sequence can include replacement of hydrogen by an alkyl, acyl, or amino group or glycosylation, pegylation, or any similar process which retains or enhances biological activity or lifespan of the molecule.

[0025] “Fragment” refers to an Incyte clone or any part of a polynucleotide which retains a usable, functional characteristic. Useful fragments include oligonucleotides which may be used in hybridization or amplification technologies or in regulation of replication, transcription or translation.

[0026] “Hybridization complex” refers to a complex between two nucleic acid sequences by virtue of the formation of hydrogen bonds between purines and pyrimidines.

[0027] “Polynucleotide” refers to a nucleic acid molecule, oligonucleotide, or any fragment thereof. It may be DNA or RNA of genomic or synthetic origin, double-stranded or single-stranded, and combined with carbohydrate, lipids, protein or other materials to perform a particular activity such as transformation or form a useful composition such as a peptide nucleic acid (PNA). “Oligonucleotide” is equivalent to the terms amplimer, primer, oligomer, element, target, and probe and is preferably single stranded.

[0028] “Protein” refers to an oligopeptide, peptide, or polypeptide or portions thereof whether naturally occurring or synthetic.

[0029] “Portion”, as used herein, refers to any part of a protein used for any purpose, but especially for the screening of molecules or compounds which specifically bind to that part or for the production of antibodies.

[0030] “Sample” is used in its broadest sense. A sample containing nucleic acids may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; a cell; a tissue; a tissue print; and the like.

[0031] Molecules or compounds which “specifically bind” the mammalian polynucleotide or protein may include nucleic acids, carbohydrates, lipids, proteins, or any other organic or inorganic molecules or their combinations which stabilize, increase, or decrease the activity of the mammalian polynucleotide or protein. “Purified” refers to nucleic acid or amino acid sequences that are removed from their natural environment and are isolated or separated, and are at least about 60% free, preferably about 75% free, and most preferably about 90% free, from other components with which they are naturally associated.

[0032] “Substrate” refers to any rigid or semi-rigid support to which polynucleotides or proteins are bound and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles with a variety of surface forms including wells, trenches, pins, channels and pores.

THE INVENTION

[0033] The invention is based on the discovery of a new mammalian polynucleotide which encodes a mammalian protein, lipid metabolism transcription factor, and the use of the nucleic acid sequence, or fragments thereof, and amino acid sequences, or portions thereof, as compositions in the characterization, diagnosis, or treatment of cell proliferative and lipid disorders.

[0034] Nucleic acids encoding the mammalian protein of the present invention were identified by BLAST using Incyte clone 700145292H1 which was differentially expressed in male rat reproductive tissue. A consensus sequence, SEQ ID NO:1, was assembled from the following overlapping and/or extended nucleic acid fragments found in Incyte Clones 1479946F6, 3241390F6, 1432520R1, 4534217H1, 2191992H1, 1320132T1, 1516707T1, 5595953H1, and 1988906R6; SEQ ID NOs:3-11, respectively. FIGS. 1A-1F show the consensus sequence and translation of SEQ ID NO:1.

[0035] In one embodiment, the protein comprising the amino acid sequence of SEQ ID NO:2, LMTF, is 183 amino acids in length and has one potential N-glycosylation site at residue N77; three potential protein kinase C phosphorylation sites at residues S96, T162, and T169; and a potential leucine zipper motif from residue L154 through L168. As shown in FIGS. 2A and 2B, the protein has chemical and structural similarity with zebrafish G12 (GI 861207; SEQ ID NO:26) and mouse S14 (GI 1171574; SEQ ID NO:27). In particular, LMTF shares 48% identity with G12 protein and 32% identity with S14. LMTF, G12, and S14 are similar in size (20 kDa, 17.5 kDa, and 17 kDa, respectively) and isoelectric point (5.3, 5.0, and 4.8, respectively), as calculated using LASERGENE software (DNASTAR). Furthermore, LMTF, G12, and S14 share conserved leucine residues comprising a zipper motif at residues L154, L161, and L168 in LMTF.
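
The spacing of the zipper leucines recited above (L154, L161, and L168, i.e., every seventh residue) can be located in a protein sequence with a short scan. The following is a minimal sketch only; the function names are hypothetical, the pattern of three leucines at seven-residue intervals is a simplification taken from the residues named in the text, and the test sequence is a placeholder because SEQ ID NO:2 is not reproduced in this section.

```python
import re

# Zero-width lookahead so that overlapping runs of heptad-spaced
# leucines are all reported, not just the first one.
HEPTAD_LEUCINES = re.compile(r"(?=L.{6}L.{6}L)")


def find_leucine_heptads(protein):
    """Return 1-based positions of each L-x6-L-x6-L run in a protein."""
    hits = []
    for match in HEPTAD_LEUCINES.finditer(protein.upper()):
        first = match.start() + 1  # convert to 1-based residue numbering
        hits.append((first, first + 7, first + 14))
    return hits


def placeholder_sequence(length=183, leucines=(154, 161, 168)):
    """Build a placeholder (NOT SEQ ID NO:2): alanines with leucines placed
    at the residue numbers given in the text."""
    residues = ["A"] * length
    for position in leucines:
        residues[position - 1] = "L"
    return "".join(residues)
```

Applied to the placeholder, `find_leucine_heptads(placeholder_sequence())` reports the single run `(154, 161, 168)`. A real analysis would substitute the actual amino acid sequence.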

[0036] Table 1 shows the nucleic acid fragments from human, rat, mouse, and monkey and their sequence coverage and identity with SEQ ID NO:1. Columns 1 and 2 list the SEQ ID NO and Incyte clone number, respectively, for each nucleic acid fragment. The fragments of SEQ ID NO:1, SEQ ID NOs:3-11, are useful in hybridization or amplification technologies to identify and distinguish between the mammalian molecules disclosed herein and similar sequences including SEQ ID NOs:12-25. Column 3 lists the nucleotide length for each fragment. Columns 4 and 5 identify the source organism and Incyte cDNA library from which the fragments were isolated, respectively. Column 6 identifies the range of nucleotide residues in SEQ ID NO:1 over which each fragment shows identity. Column 7 shows the percent sequence identity between each fragment and SEQ ID NO:1 over the nucleotides set forth in column 6.
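
The percent identity reported in column 7 over the range in column 6 can be sketched as a simple calculation. This is an illustration under stated assumptions, not the method used to generate Table 1: it assumes an ungapped alignment in which the fragment covers the stated range exactly, whereas alignment programs generally allow gaps. The function name and the short sequences in the usage note are hypothetical.

```python
def percent_identity(fragment, reference, start, end):
    """Percent identity of a fragment over reference positions start..end.

    start and end are 1-based and inclusive, as in a residue range.
    Assumes an ungapped alignment: the fragment must be exactly as long
    as the reference range it is compared against.
    """
    region = reference[start - 1:end]
    if len(fragment) != len(region):
        raise ValueError("fragment length must equal the length of the range")
    matches = sum(
        1 for a, b in zip(fragment.upper(), region.upper()) if a == b
    )
    return 100.0 * matches / len(region)
```

As a toy example, comparing the fragment `GTACCT` with positions 3 to 8 of the reference `ACGTACGTAC` (the region `GTACGT`) gives five matches out of six positions.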

[0037] Northern analysis shows the expression of LMTF in various libraries, particularly in nervous tissues of human, rat, and monkey. Of particular note is the expression of LMTF in conditions associated with cell proliferation, such as cancer and inflammation.

[0038] The mammalian fragments comprising SEQ ID NO:12-13 from monkey, SEQ ID NO:14-15 from mouse, and SEQ ID NO:16-25 from rat were identified using either SEQ ID NO:1 or SEQ ID NOs:3-11. These fragments may be used to obtain the full length sequence for a particular species which in turn can be used to produce transgenic animals which mimic human diseases. The fragments are useful in hybridization and amplification technologies to monitor animal toxicological studies, clinical trials, and subject/patient treatment profiles through time.

[0039] Characterization and Use of the Invention

[0040] In a particular embodiment disclosed herein, mRNA was isolated from mammalian cells and tissues using methods which are well known to those skilled in the art and used to prepare the cDNA libraries. The Incyte clones listed above were isolated from mammalian cDNA libraries. At least one library preparation representative of the invention is described in the EXAMPLES below. The consensus mammalian sequence was chemically and/or electronically assembled from fragments including Incyte clones, extension, and/or shotgun sequences using computer programs such as the AUTOASSEMBLER application (Applied Biosystems, Foster City Calif.).

[0041] Methods for sequencing nucleic acids are well known in the art and may be used to practice any of the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment of DNA polymerase I, T7 SEQUENASE DNA polymerase, Taq DNA polymerase, and THERMOSEQUENASE DNA polymerase (Amersham Pharmacia Biotech (APB), Piscataway N.J.) or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies, Rockville Md.). Preferably, sequence preparation is automated with machines such as the HYDRA microdispenser (Robbins Scientific, Sunnyvale Calif.), MICROLAB 2200 system (Hamilton, Reno Nev.), and the DNA ENGINE thermal cycler (MJ Research, Watertown Mass.). Machines used for sequencing include the ABI 3700, 377 or 373 DNA sequencing systems (Applied Biosystems), the MEGABACE 1000 DNA sequencing system (APB), and the like. The sequences may be analyzed using a variety of algorithms which are well known in the art and described in Ausubel (1997; Short Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., unit 7.7) and in Meyers (1995; Molecular Biology and Biotechnology, Wiley VCH, New York N.Y., pp. 856-853).

[0042] Shotgun sequencing is used to generate more sequence from cloned inserts derived from multiple sources. Shotgun sequencing methods are well known in the art and use thermostable DNA polymerases, heat-labile DNA polymerases, and primers chosen from representative regions flanking the nucleic acid sequences of interest. Prefinished sequences (incomplete assembled sequences) are inspected for identity using various algorithms or programs well known in the art (Gordon (1998) Genome Res 8:195-202). Contaminating sequences including vector or chimeric sequences or deleted sequences can be removed or restored, respectively, organizing the prefinished sequences into finished sequences.

[0043] The sequences of the invention may be extended using various PCR-based methods known in the art. For example, the XL-PCR kit (Applied Biosystems), nested primers, and commercially available cDNA or genomic DNA libraries (Life Technologies; Clontech, Palo Alto Calif., respectively) may be used to extend the nucleotide sequence. For all PCR-based methods, primers may be designed using commercially available software, such as OLIGO software (Molecular Biology Insights, Cascade Colo.) to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to a target sequence at temperatures of about 68° C. to 72° C. When extending a sequence to recover regulatory elements, it is preferable to use genomic, rather than cDNA libraries.
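
The primer criteria recited above (about 22 to 30 nucleotides, GC content of about 50% or more, annealing at about 68° C. to 72° C.) can be checked with a minimal sketch. The function names are hypothetical. The temperature estimate uses a common GC-content approximation for melting temperature, Tm = 64.9 + 41 × (G + C − 16.4) / N, which is an assumption introduced here for illustration; it is not the algorithm used by the OLIGO software, and a melting temperature is only a rough proxy for a suitable annealing temperature.

```python
def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(primer):
    """Rough melting temperature in degrees C from GC content.

    Uses the approximation Tm = 64.9 + 41 * (G + C - 16.4) / N,
    an assumption for illustration only.
    """
    seq = primer.upper()
    gc_count = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(seq)


def check_primer(primer):
    """Compare a primer against the length, GC, and temperature criteria."""
    tm = estimate_tm(primer)
    return {
        "length_ok": 22 <= len(primer) <= 30,
        "gc_ok": gc_fraction(primer) >= 0.50,
        "tm_ok": 68.0 <= tm <= 72.0,
        "tm_estimate": tm,
    }
```

A candidate that passes the length and GC checks may still fall outside the temperature window under this estimate, in which case the primer would be lengthened or shifted along the target sequence.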

[0044] The polynucleotide sequence of SEQ ID NO:1 and fragments thereof can be used in various hybridization technologies for various purposes. Hybridization probes may be designed or derived from SEQ ID NO:1. Such probes may be made from a highly specific region such as the 5′ regulatory region or from a conserved motif, and used in protocols to identify naturally occurring sequences encoding the mammalian protein, allelic variants, or related sequences, and should preferably have at least 50% sequence identity to any of the protein sequences. The hybridization probes of the subject invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:1 or from genomic sequences including promoters, enhancers, and introns of the mammalian gene. Hybridization or PCR probes may be produced using oligolabeling, nick translation, end-labeling, or PCR amplification in the presence of the labeled nucleotide. A vector containing the nucleic acid sequence may be used to produce an mRNA probe in vitro by addition of an RNA polymerase and labeled nucleotides. These procedures may be conducted using commercially available kits such as those provided by APB.

[0045] The stringency of hybridization is determined by G+C content of the probe, salt concentration, and temperature. In particular, stringency can be increased by reducing the concentration of salt or raising the hybridization temperature. In solutions used for some membrane based hybridizations, addition of an organic solvent such as formamide allows the reaction to occur at a lower temperature. Hybridization can be performed at low stringency with buffers, such as 5×SSC with 1% sodium dodecyl sulfate (SDS) at 60° C., which permits the formation of a hybridization complex between nucleotide sequences that contain some mismatches. Subsequent washes are performed at higher stringency with buffers such as 0.2×SSC with 0.1% SDS at either 45° C. (medium stringency) or 68° C. (high stringency). At high stringency, hybridization complexes will remain stable only where the nucleic acid sequences are completely complementary. In some membrane-based hybridizations, preferably 35%, or most preferably 50%, formamide can be added to the hybridization solution to reduce the temperature at which hybridization is performed, and background signals can be reduced by the use of other detergents such as Sarkosyl or TRITON X-100 (Sigma-Aldrich, St. Louis Mo.) and a blocking agent such as salmon sperm DNA. Selection of components and conditions for hybridization are well known to those skilled in the art and are reviewed in Ausubel (supra) and Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.

[0046] Microarrays may be prepared and analyzed using methods known in the art. Oligonucleotides may be used as either probes or targets in a microarray. The microarray can be used to monitor the expression level of large numbers of genes simultaneously and to identify genetic variants, mutations, and single nucleotide polymorphisms. Such information may be used to determine gene function; to understand the genetic basis of a condition, disease, or disorder; to diagnose a condition, disease, or disorder; and to develop and monitor the activities of therapeutic agents. (See, e.g., Brennan et al. (1995) U.S. Pat. No. 5,474,796; Schena et al. (1996) Proc Natl Acad Sci 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon et al. (1995) PCT application WO95/35505; Heller et al. (1997) Proc Natl Acad Sci 94:2150-2155; and Heller et al. (1997) U.S. Pat. No. 5,605,662.)

[0047] Hybridization probes are also useful in mapping the naturally occurring genomic sequence. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome DNA libraries.

[0048] A multitude of polynucleotide sequences capable of encoding the mammalian protein may be cloned into a vector and used to express the protein, or portions thereof, in host cells. The nucleotide sequence can be engineered by such methods as DNA shuffling (Stemmer and Crameri (1996) U.S. Pat. No. 5,830,721) and site-directed mutagenesis to create new restriction sites, alter glycosylation patterns, change codon preference to increase expression in a particular host, produce splice variants, extend half-life, and the like. The expression vector may contain transcriptional and translational control elements (promoters, enhancers, specific initiation signals, and 3′ untranslated regions) from various sources which have been selected for their efficiency in a particular host. The vector, nucleic acid sequence, and regulatory elements are combined using in vitro recombinant DNA techniques, synthetic techniques, and/or in vivo genetic recombination techniques well known in the art and described in Sambrook (supra, ch. 4, 8, 16 and 17).

[0049] A variety of host systems may be transformed with an expression vector. These include, but are not limited to, bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems transformed with baculovirus expression vectors; plant cell systems transformed with expression vectors containing viral and/or bacterial elements, or animal cell systems (Ausubel supra, unit 16). For example, an adenovirus transcription/translation complex may be utilized in mammalian cells. Sequences may be ligated into the non-essential E1 or E3 region of the viral genome, and the infective virus used to transform and express the protein in host cells. The Rous sarcoma virus enhancer or SV40 or EBV-based vectors may also be used for high-level protein expression.

[0050] Routine cloning, subcloning, and propagation of polynucleotide sequences can be achieved using the multifunctional PBLUESCRIPT vector (Stratagene, La Jolla Calif.) or PSPORT1 plasmid (Life Technologies). Introduction of a polynucleotide sequence into the multiple cloning site of these vectors disrupts the lacZ gene and allows colorimetric screening for transformed bacteria. In addition, these vectors may be useful for in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned sequence.

[0051] For long term production of recombinant proteins, the vector can be stably transformed into cell lines along with a selectable or visible marker gene on the same or on a separate vector. After transformation, cells are allowed to grow for about 1 to 2 days in enriched media and then are transferred to selective media. Selectable markers, such as antimetabolite, antibiotic, or herbicide resistance genes, confer resistance to the relevant selective agent and allow growth and recovery of cells which successfully express the introduced sequences. Resistant clones identified either by survival on selective media or by the expression of visible markers, such as anthocyanins, green fluorescent protein (GFP), β glucuronidase, luciferase and the like, may be propagated using tissue culture techniques. Visible markers are also used to quantify the amount of protein expressed by the introduced genes. Verification that the host cell contains the desired mammalian polynucleotide is based on DNA-DNA or DNA-RNA hybridizations or PCR amplification techniques.

[0052] The host cell may be chosen for its ability to modify a recombinant protein in a desired fashion. Such modifications include acetylation, carboxylation, glycosylation, phosphorylation, lipidation, acylation and the like. Post-translational processing which cleaves a “prepro” form may also be used to specify protein targeting, folding, and/or activity. Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (CHO, HEK293, and WI38; American Type Culture Collection, Manassas Va.) may be chosen to ensure the correct modification and processing of the foreign protein.

[0053] Heterologous moieties engineered into a vector for ease of purification include glutathione S-transferase (GST), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and the like. GST, CBP, and 6-His are purified using commercially available affinity matrices such as immobilized glutathione, calmodulin, and metal-chelate resins, respectively. FLAG and c-myc are purified using commercially available monoclonal and polyclonal antibodies. A proteolytic cleavage site may be located between the desired protein sequence and the heterologous moiety for ease of separation following purification. Methods for recombinant protein expression and purification are discussed in Ausubel (supra, unit 16) and are commercially available.

[0054] Proteins or portions thereof may be produced not only by recombinant methods, but also by using chemical methods well known in the art. Solid phase peptide synthesis may be carried out in a batchwise or continuous flow process which sequentially adds α-amino- and side chain-protected amino acid residues to an insoluble polymeric support via a linker group. A linker group such as methylamine-derivatized polyethylene glycol is attached to poly(styrene-co-divinylbenzene) to form the support resin. The amino acid residues are N-α-protected by acid-labile Boc (t-butyloxycarbonyl) or base-labile Fmoc (9-fluorenylmethoxycarbonyl). The carboxyl group of the protected amino acid is coupled to the amine of the linker group to anchor the residue to the solid phase support resin. Trifluoroacetic acid or piperidine is used to remove the protecting group in the case of Boc or Fmoc, respectively. Each additional amino acid is added to the anchored residue using a coupling agent or pre-activated amino acid derivative, and the resin is washed. The full length peptide is synthesized by sequential deprotection, coupling of derivatized amino acids, and washing with dichloromethane and/or N,N-dimethylformamide. The peptide is cleaved between the peptide carboxyterminus and the linker group to yield a peptide acid or amide (Novabiochem 1997/98 Catalog and Peptide Synthesis Handbook, San Diego Calif. pp. S1-S20). Automated synthesis may also be carried out on machines such as the ABI 431A peptide synthesizer (Applied Biosystems). A protein or portion thereof may be purified by preparative high performance liquid chromatography and its composition confirmed by amino acid analysis or by sequencing (Creighton (1984) Proteins, Structures and Molecular Properties, WH Freeman, New York N.Y.).

[0055] Various hosts including goats, rabbits, rats, mice, humans, and others may be immunized by injection with mammalian protein or any portion thereof. Adjuvants such as Freund's, mineral gels, and surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin (KLH), and dinitrophenol may be used to increase immunological response. The oligopeptide, peptide, or portion of protein used to induce antibodies should consist of at least about five amino acids, more preferably ten amino acids, which are identical to a portion of the natural protein. Oligopeptides may be fused with proteins such as KLH in order to produce antibodies to the chimeric molecule.

[0056] Monoclonal antibodies may be prepared using any technique whichprovides for the production of antibodies by continuous cell lines inculture. These include, but are not limited to, the hybridoma technique,the human B-cell hybridoma technique, and the EBV-hybridoma technique.(See, e.g., Kohler et al. (1975) Nature 256:495-497; Kozbor et al.(1985) J Immunol Methods 81:31-42; Cote et al. (1983) Proc Natl Acad Sci80:2026-2030; and Cole et al. (1984) Mol Cell Biol 62:109-120.)

[0057] Alternatively, techniques described for the production of singlechain antibodies may be adapted, using methods known in the art, toproduce epitope-specific single chain antibodies. Antibody fragmentswhich contain specific binding sites for epitopes of the mammalianprotein may also be generated. For example, such fragments include, butare not limited to, F(ab′)2 fragments produced by pepsin digestion ofthe antibody molecule and Fab fragments generated by reducing thedisulfide bridges of the F(ab′)2 fragments. Alternatively, Fabexpression libraries may be constructed to allow rapid and easyidentification of monoclonal Fab fragments with the desired specificity.(See, e.g., Huse et al. (1989) Science 246:1275-1281.)

[0058] The mammalian protein may be used in screening assays of phagemidor B-lymphocyte immunoglobulin libraries to identify antibodies havingthe desired specificity. Numerous protocols for competitive binding orimmunoassays using either polyclonal or monoclonal antibodies withestablished specificities are well known in the art. Such immunoassaystypically involve the measurement of complex formation between theprotein and its specific antibody. A two-site, monoclonal-basedimmunoassay utilizing monoclonal antibodies reactive to twonon-interfering epitopes is preferred, but a competitive binding assaymay also be employed (Pound (1998) Immunochemical Protocols, HumanaPress, Totowa N.J.).

[0059] A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid, amino acid, and antibody assays. Synthesis of labeled molecules may be achieved using Promega (Madison Wis.) or APB kits for incorporation of a labeled nucleotide such as ³²P-dCTP, Cy3-dCTP or Cy5-dCTP or amino acid such as ³⁵S-methionine (APB). Nucleic acids and amino acids may be directly labeled with a variety of substances including fluorescent, chemiluminescent, or chromogenic agents, and the like, by chemical conjugation to amines, thiols and other groups present in the molecules using reagents such as BODIPY or FITC (Molecular Probes, Eugene Oreg.).

[0060] Diagnostics

[0061] The polynucleotides, fragments, oligonucleotides, complementary RNA and DNA molecules, and PNAs may be used to detect and quantify altered gene expression, absence/presence vs. excess, expression of mRNAs or to monitor mRNA levels during therapeutic intervention. Conditions, diseases or disorders associated with altered expression of LMTF include, but are not limited to, a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease, myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; and a lipid disorder such as fatty liver, cholestasis, carnitine deficiency, carnitine palmitoyltransferase deficiency, myoadenylate deaminase deficiency, hypertriglyceridemia, lipid storage disorders such as Fabry's disease, Gaucher's disease, Niemann-Pick's disease, metachromatic leukodystrophy, adrenoleukodystrophy, GM₂ gangliosidosis, and ceroid lipofuscinosis, abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes mellitus, lipodystrophy, lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa, lipoid adrenal hyperplasia, minimal change disease, lipomas, hypercholesterolemia, hypercholesterolemia with hypertriglyceridemia, primary hypoalphalipoproteinemia, hypothyroidism, renal disease, liver disease, lecithin:cholesterol acyltransferase deficiency, cerebrotendinous xanthomatosis, sitosterolemia, hypocholesterolemia, Tay-Sachs disease, Sandhoff's disease, hyperlipidemia, hyperlipemia, lipid myopathies, and obesity. The diagnostic assay may use hybridization or amplification technology to compare gene expression in a biological sample from a patient to standard samples in order to detect altered gene expression. Qualitative or quantitative methods for this comparison are well known in the art.

[0062] For example, the nucleotide sequence may be labeled by standardmethods and added to a biological sample from a patient. After anincubation period in which hybridization complexes form, the sample iswashed and the amount of label, or its signal, is quantified andcompared with a standard value. If the amount of label in the patientsample is significantly altered in comparison to the standard value,then the presence of the associated condition, disease or disorder isindicated.

[0063] In order to provide a basis for the diagnosis of a condition, disease or disorder associated with gene expression, a normal or standard expression profile is established. This may be accomplished by combining a biological sample taken from normal subjects, either animal or human, with a sequence or a fragment thereof under conditions for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with values from an experiment in which a known amount of an isolated and purified polynucleotide is used. Standard values obtained in this manner may be compared with values obtained from samples from patients who are symptomatic for a particular condition, disease, or disorder. Deviation from standard values toward those associated with a particular condition is used to diagnose that condition.
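The comparison of a patient value against standard values can be sketched as follows. This is only an illustration: the specification does not prescribe a statistic, so the z-score, the function names, and the cutoff of 2.0 standard deviations are all assumptions introduced here.

```python
from statistics import mean, stdev


def expression_z_score(patient_value, normal_values):
    """Number of standard deviations separating a patient value from the normal mean.

    Assumption: the "standard value" is summarized as the mean and sample
    standard deviation of signals measured in normal subjects.
    """
    if len(normal_values) < 2:
        raise ValueError("need at least two normal values to estimate spread")
    spread = stdev(normal_values)
    if spread == 0:
        raise ValueError("normal values show no variation")
    return (patient_value - mean(normal_values)) / spread


def is_altered(patient_value, normal_values, cutoff=2.0):
    """Flag expression as altered when |z| reaches the (hypothetical) cutoff."""
    return abs(expression_z_score(patient_value, normal_values)) >= cutoff


normal_signals = [10.0, 12.0, 11.0, 9.0, 13.0]   # illustrative values only
print(is_altered(18.0, normal_signals))
print(is_altered(11.5, normal_signals))
```

Any real assay would set its own criterion for a significant deviation; the cutoff here is a placeholder.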

[0064] Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies and in clinical trials or to monitor the treatment of an individual patient. Once the presence of a condition is established and a treatment protocol is initiated, diagnostic assays may be repeated on a regular basis to determine if the level of expression in the patient begins to approximate that which is observed in a normal subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

[0065] Therapeutics

[0066] Chemical and structural similarity, e.g., in the context ofsequences and motifs, exists between regions of the mammalian LMTF,zebrafish G12, and mouse Spot14. In addition, expression is closelyassociated with nervous tissue and appears to play a role in cellproliferative and inflammatory disorders. In the treatment of conditionsassociated with increased expression or activity, it is desirable todecrease expression or protein activity. In the treatment of conditionsassociated with decreased expression or activity, it is desirable toincrease expression or protein activity.

[0067] In one embodiment, the mammalian protein or a portion or derivative thereof may be administered to a subject to treat or prevent a condition associated with altered expression or activity of the mammalian protein. Examples of such conditions include, but are not limited to, a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease, myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; and a lipid disorder such as fatty liver, cholestasis, carnitine deficiency, carnitine palmitoyltransferase deficiency, myoadenylate deaminase deficiency, hypertriglyceridemia, lipid storage disorders such as Fabry's disease, Gaucher's disease, Niemann-Pick's disease, metachromatic leukodystrophy, adrenoleukodystrophy, GM₂ gangliosidosis, and ceroid lipofuscinosis, abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes mellitus, lipodystrophy, lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa, lipoid adrenal hyperplasia, minimal change disease, lipomas, hypercholesterolemia, hypercholesterolemia with hypertriglyceridemia, primary hypoalphalipoproteinemia, hypothyroidism, renal disease, liver disease, lecithin:cholesterol acyltransferase deficiency, cerebrotendinous xanthomatosis, sitosterolemia, hypocholesterolemia, Tay-Sachs disease, Sandhoff's disease, hyperlipidemia, hyperlipemia, lipid myopathies, and obesity.

[0068] In another embodiment, a composition comprising the purifiedmammalian protein in conjunction with a pharmaceutical carrier may beadministered to a subject to treat or prevent a condition associatedwith altered expression or activity of the mammalian protein including,but not limited to, those provided above.

[0069] In a further embodiment, an agonist which modulates the activityof the mammalian protein may be administered to a subject to treat orprevent a condition associated with altered expression or activity ofthe protein including, but not limited to, those listed above.

[0070] In an additional embodiment, a vector capable of expressing themammalian protein or a portion or derivative thereof may be administeredto a subject to treat or prevent a condition associated with alteredexpression or activity of protein including, but not limited to, thosedescribed above.

[0071] In yet another embodiment, an antagonist or inhibitor of themammalian protein may be administered to a subject to treat or prevent acondition associated with altered expression or activity of the protein.In one aspect, an antibody which specifically binds the mammalianprotein may be used directly as an antagonist or indirectly as atargeting or delivery mechanism for bringing a pharmaceutical agent tocells or tissue which express the mammalian protein.

[0072] In a still further embodiment, a vector expressing the complementof the polynucleotide encoding the mammalian protein may be administeredto a subject to treat or prevent a condition associated with alteredexpression or activity of the protein including, but not limited to,those described above.

[0073] Any of the nucleic acids, complementary sequences, vectors,proteins, agonists, antagonists, or antibodies of the invention may beadministered in combination with other therapeutic agents. Selection ofthe agents for use in combination therapy may be made by one of ordinaryskill in the art according to conventional pharmaceutical principles. Acombination of therapeutic agents may act synergistically to effecttreatment of a particular condition at a lower dosage of each agent.

[0074] Gene expression may be modified by designing complementarysequences or antisense molecules (DNA, RNA, or PNA) to the control, 5′,or regulatory regions of the gene encoding the mammalian protein.Oligonucleotides designed with reference to the transcription initiationsite are preferred. Similarly, inhibition can be achieved using triplehelix base-pairing which inhibits the binding of polymerases,transcription factors, or regulatory molecules (Gee et al. In: Huber andCarr (1994) Molecular and Immunologic Approaches, Futura Publishing, Mt.Kisco N.Y., pp. 163-177.) A complementary sequence may also be designedto block translation by preventing binding between ribosomes and mRNA.

[0075] Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA followed by endonucleolytic cleavage at sites such as GUA, GUU, and GUC. Once such sites are identified, an oligonucleotide with the same sequence may be evaluated for secondary structural features which would render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing their hybridization with complementary oligonucleotides using ribonuclease protection assays.
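Identification of candidate cleavage sites can be illustrated with a short scan for the three triplets named above. This is a minimal sketch; the function name and the example sequence are hypothetical, and the scan does not assess secondary structure or hybridization.

```python
# Triplets named in the text as ribozyme cleavage sites.
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_sites(rna):
    """Return (0-based position, triplet) for every GUA, GUU, or GUC in the sequence.

    DNA-style input is accepted: T is read as U before scanning.
    """
    rna = rna.upper().replace("T", "U")
    return [
        (i, rna[i:i + 3])
        for i in range(len(rna) - 2)
        if rna[i:i + 3] in CLEAVAGE_TRIPLETS
    ]


# Illustrative sequence only; not taken from the Sequence Listing.
print(find_cleavage_sites("AAGUCGGUUA"))  # [(2, 'GUC'), (6, 'GUU')]
```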

[0076] Complementary nucleic acids and ribozymes of the invention may be prepared via recombinant expression, in vitro or in vivo, or using solid phase phosphoramidite chemical synthesis. In addition, RNA molecules may be modified to increase intracellular stability and half-life by addition of flanking sequences at the 5′ and/or 3′ ends of the molecule or by the use of phosphorothioate or 2′ O-ethyl rather than phosphodiester linkages within the backbone of the molecule. Modification is inherent in the production of PNAs and can be extended to other nucleic acid molecules. Either the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, or the modification of adenine, cytidine, guanine, thymine, and uridine with acetyl-, methyl-, or thio- groups renders the molecule less available to endogenous endonucleases.

[0077] The nucleic acid sequence encoding the mammalian protein may beused to screen a library of molecules for specific binding affinity. Theassay can be used to screen a library of DNA molecules, RNA molecules,PNAs, peptides, or proteins including transcription factors, enhancers,repressors, and the like which regulate the activity of the nucleic acidsequence in the biological system. The assay involves providing alibrary of molecules, combining the mammalian nucleic acid sequence or afragment thereof with the library of molecules under conditions to allowspecific binding, and detecting specific binding to identify at leastone molecule which specifically binds the nucleic acid sequence.

[0078] Similarly, the mammalian protein or a portion thereof may be used to screen libraries of molecules in any of a variety of screening assays. The portion of the protein employed in such screening may be free in solution, affixed to an abiotic or biotic substrate (e.g. borne on a cell surface), or located intracellularly. Specific binding between the protein and molecule may be measured. Depending on the kind of library being screened, the assay may be used to identify DNA, RNA, or PNA molecules, agonists, antagonists, antibodies, immunoglobulins, inhibitors, peptides, proteins, drugs and the like, which specifically bind the protein. One method for high throughput screening using very small assay volumes and very small amounts of test compound is described in U.S. Pat. No. 5,876,946, which screens large numbers of molecules for enzyme inhibition or receptor binding.

[0079] Pharmaceutical compositions are those substances wherein theactive ingredients are contained in an effective amount to achieve adesired and intended purpose. The determination of an effective dose iswell within the capability of those skilled in the art. For anycompound, the therapeutically effective dose may be estimated initiallyeither in cell culture assays or in animal models. The animal model isalso used to achieve a desirable concentration range and route ofadministration. Such information may then be used to determine usefuldoses and routes for administration in humans.

[0080] A therapeutically effective dose refers to that amount of protein or inhibitor which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity of such agents may be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index, and it may be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which exhibit large therapeutic indexes are preferred. The data obtained from cell culture assays and animal studies are used in formulating a range of dosage for human use.
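The ratio LD50/ED50 can be written out as a small calculation. The function name and the dose values below are illustrative assumptions; both doses are taken to be in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Therapeutic index as the ratio LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
print(therapeutic_index(500.0, 25.0))  # 20.0
```

A larger value means a wider margin between the effective dose and the lethal dose, which is why compositions with large therapeutic indexes are preferred.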

[0081] Model Systems

[0082] Animal models may be used as bioassays where they exhibit a toxic response similar to that of humans and where exposure conditions are relevant to human exposures. Mammals are the most common models and most toxicity studies are performed on rodents such as rats or mice because of low cost, availability, and abundant reference toxicology. Inbred rodent strains provide a convenient model for investigation of the physiological consequences of under- or over-expression of genes of interest and for the development of methods for diagnosis and treatment of diseases. A rodent strain inbred to over-express a particular gene may also serve as a convenient source of the protein expressed by that gene.

[0083] Toxicology is the study of the effects of agents on livingsystems. The majority of toxicity studies are performed on rats or miceto help predict the effects of these agents on human health. Observationof qualitative and quantitative changes in physiology, behavior,homeostatic processes, and lethality are used to generate a toxicityprofile and to assess the consequences on human health followingexposure to the agent.

[0084] Toxicological tests measure the effects of a single, repeated, orlong-term exposure of a subject to an agent. Agents may be tested forspecific endpoints such as cytotoxicity, mutagenicity, carcinogenicityand teratogenicity. Degree of response varies according to the route ofexposure (contact, ingestion, injection, or inhalation), age, sex,genetic makeup, and health status of the subject. Toxicokinetic studiestrace the absorption, distribution, metabolism, storage, and excretionof the agent in subject tissues, and toxicodynamic studies chartbiological responses that are consequences of the presence of the agentin subject tissues.

[0085] Genetic toxicology identifies and analyzes the ability of an agent to produce genetic mutations. Genotoxic agents usually have common chemical or physical properties that facilitate interaction with nucleic acids and are most harmful when chromosomal aberrations are passed along to progeny. Toxicological studies may identify agents that increase the frequency of structural or functional abnormalities in progeny if administered to either parent before conception, to the mother during pregnancy, or to the developing organism. Mice and rats are most frequently used in these tests because of their short reproductive cycle and their capacity to be raised in numbers sufficient to satisfy statistical requirements.

[0086] All toxicology studies on experimental animals involve thepreparation of a form of the agent for administration, the selection ofthe route of administration, and the selection of the species toresemble the species of pharmacological interest. Dose concentrationsare varied to investigate a range of dose-related effects which areidentified, measured, and related to exposure.

[0087] Acute toxicity tests are based on a single administration of theagent to the subject to determine the symptomology or lethality of theagent. Three experiments are conducted: 1) an initial dose-range-findingexperiment, 2) an experiment to narrow the range of effective doses, and3) a final experiment for establishing the dose-response curve.

[0088] Prolonged toxicity tests are based on the repeated administrationof the agent. Rat and dog are commonly used in these studies to providedata from species in different families. With the exception ofcarcinogenesis, there is considerable evidence that daily administrationof an agent at high-dose concentrations for periods of three to fourmonths will reveal most forms of toxicity in adult animals.

[0089] Chronic toxicity tests, with a duration of a year or more, areused to demonstrate either the absence of toxicity or the carcinogenicpotential of an agent. When studies are conducted on rats, a minimum ofthree test groups plus one control group are used, and animals areexamined and monitored at the outset and at intervals throughout theexperiment.

[0090] Transgenic rodents which over-express or under-express a gene of interest may be inbred and used to model human diseases or to test therapeutic or toxic agents. (See, e.g., van Beusechem and Valerio, In: Murray (1992) Transgenesis: Applications of Gene Transfer, John Wiley & Sons Ltd. Chichester, England, pp. 283-289.) To produce the rat or mouse model, a gene candidate which mimics a human disease is coupled to a strong promoter and injected into a fertilized egg, and the egg is transferred into a pseudopregnant dam. The promoter may be activated at a specific time in a specific tissue type during fetal development or postnatally. Expression of the transgene is monitored by analysis of phenotype and of tissue-specific mRNA expression, and the animals are challenged with experimental drug therapies. Examples of transgenes used as models of human disease include the investigation of the mutant amyloid precursor protein and apolipoprotein E genes in familial Alzheimer's Disease (Price and Sisodia (1998) Ann Rev Neurosci 21:479-505).

[0091] Embryonic stem cells (ES) isolated from rodent embryos retain thepotential to form an embryo. When ES cells are placed inside a carrierembryo, they resume normal development and contribute to all tissues ofthe live-born animal. ES cells are the preferred cells used in thecreation of experimental knockout and knockin rodent strains. Mouse EScells, such as the mouse 129/SvJ cell line, are derived from the earlymouse embryo and are grown under culture conditions well known in theart. Vectors for knockout strains contain a disease gene candidatemodified to include a marker gene sequence which disrupts transcriptionand/or translation in vivo. The vector is introduced into ES cells bytransformation methods such as electroporation, liposome delivery,microinjection, and the like which are well known in the art. Theendogenous rodent gene is replaced by the disrupted disease gene throughhomologous recombination and integration during cell division. Thentransformed ES cells are selected, identified, and preferablymicroinjected into mouse cell blastocysts such as those from the C57BL/6mouse strain. The blastocysts are surgically transferred topseudopregnant dams and the resulting chimeric progeny are genotyped andbred to produce heterozygous or homozygous strains.

[0092] ES cells are also used to study the differentiation of variouscell types and tissues in vitro, such as neural cells, hematopoieticlineages, and cardiomyocytes (Bain et al. (1995) Dev Biol 168:342-357;Wiles and Keller (1991) Development 111:259-267; and Klug et al. (1996)J Clin Invest 98:216-224). Recent developments demonstrate that ES cellsderived from human blastocysts may also be manipulated in vitro todifferentiate into eight separate cell lineages, including endoderm,mesoderm, and ectodermal cell types (Thomson (1998) Science282:1145-1147).

[0093] As described herein, the uses of the nucleotide sequences,provided in the Sequence Listing of this application, are exemplary ofknown techniques and are not intended to reflect any limitation on theiruse in any technique that would be known to the person of average skillin the art. Furthermore, the nucleotide sequences provided in thisapplication may be used in molecular biology techniques that have notyet been developed, provided the new techniques rely on properties ofnucleotide sequences that are currently known to the person of ordinaryskill in the art.

[0094] In gene knockout analysis, a region of a human disease genecandidate is enzymatically modified to include a non-mammalian gene suchas the neomycin phosphotransferase gene (neo; Capecchi (1989) Science244:1288-1292). The inserted coding sequence disrupts transcription andtranslation of the targeted gene and prevents biochemical synthesis ofthe disease candidate protein. The modified gene is transformed intocultured embryonic stem cells (described above), the transformed cellsare injected into rodent blastulae, and the blastulae are implanted intopseudopregnant dams. Transgenic progeny are crossbred to obtainhomozygous inbred lines.

[0095] Totipotent ES cells, present in the early stages of embryonicdevelopment, can be used to create knockin humanized animals (pigs) ortransgenic animal models (mice or rats) of human diseases. With knockintechnology, a region of a human gene is injected into animal ES cells,and the human sequence integrates into the animal cell genome byrecombination. Totipotent ES cells which contain the integrated humangene are handled as described above. Inbred animals are studied andtreated to obtain information on the analogous human condition. Thesemethods have been used to model several human diseases. (See, e.g., Leeet al. (1998) Proc Natl Acad Sci 95:11371-11376; Baudoin et al. (1998)Genes Dev 12:1202-1216; and Zhuang et al. (1998) Mol Cell Biol18:3340-3349).

[0096] The field of animal testing deals with data and methodology from basic sciences such as physiology, genetics, chemistry, pharmacology and statistics. These data are paramount in evaluating the effects of therapeutic agents on non-human primates as they can be related to human health. Monkeys are used as human surrogates in vaccine and drug evaluations, and their responses are relevant to human exposures under similar conditions. Cynomolgus and rhesus monkeys (Macaca fascicularis and M. mulatta, respectively) and common marmosets (Callithrix jacchus) are the most common non-human primates (NHPs) used in these investigations. Since great cost is associated with developing and maintaining a colony of NHPs, early research and toxicological studies are usually carried out in rodent models. In studies using behavioral measures such as drug addiction, NHPs are the first choice test animal. In addition, NHPs and individual humans exhibit differential sensitivities to many drugs and toxins and can be classified as “extensive metabolizers” and “poor metabolizers” of these agents. For this reason, NHPs are the favored models for studying metabolism and toxicology of agents acted upon by the cytochrome P450 family of enzymes.

[0097] In additional embodiments, the nucleotide sequences which encodethe mammalian protein may be used in any molecular biology techniquesthat have yet to be developed, provided the new techniques rely onproperties of nucleotide sequences that are currently known, including,but not limited to, such properties as the triplet genetic code andspecific base pair interactions. All patents and publications cited areincorporated herein by reference.

EXAMPLES

[0098] It is to be understood that this invention is not limited to theparticular machines, materials and methods described. Althoughparticular embodiments are described, equivalent embodiments may be usedto practice the invention. The described embodiments are not intended tolimit the scope of the invention which is limited only by the appendedclaims. The examples below are provided to illustrate the subjectinvention and are not included for the purpose of limiting theinvention. For purposes of example, the preparation of the human corpuscallosum cDNA library, CORPNOT02, is described.

[0099] I Representative cDNA Sequence Preparation

[0100] The human corpus callosum cDNA library CORPNOT02 was constructedfrom tissue obtained from a 74-year-old Caucasian male (specimen#RA95-09-0670; International Institute for the Advancement of Medicine,Exton Pa.) who died from Alzheimer's disease. The frozen tissue washomogenized and lysed in guanidinium isothiocyanate solution using aPOLYTRON homogenizer (PT-3000; Brinkmann Instruments, Westbury N.J.).The lysate was centrifuged over a 5.7 M CsCl cushion using an SW28 rotorin an L8-70M ultracentrifuge (Beckman Coulter, Fullerton Calif.) for 18hours at 25,000 rpm at ambient temperature. The RNA was extracted withacid phenol, pH 4.7, precipitated using 0.3 M sodium acetate and 2.5volumes of ethanol, resuspended in RNAse-free water, and treated withDNAse (Life Technologies) at 37° C. The RNA extraction and precipitationwere repeated as before.

[0101] Messenger RNA (mRNA) was isolated using the OLIGOTEX kit (Qiagen,Valencia Calif.) and used to construct the cDNA library. The mRNA washandled according to the recommended protocols in the SUPERSCRIPTplasmid system (Life Technologies) which contains a NotI primer-adaptordesigned to prime the first strand cDNA synthesis at the poly(A) tail ofmRNAs. Double stranded cDNA was blunted, ligated to EcoRI adaptors anddigested with NotI (New England Biolabs, Beverly Mass.). The cDNAs werefractionated on a SEPHAROSE CL-4B column (APB), and those cDNAsexceeding 400 bp were ligated into the NotI and EcoRI sites of the pINCYplasmid (Incyte Genomics, Palo Alto Calif.). The plasmid was transformedinto DH5α or ELECTROMAX DH10B competent cells (Life Technologies).

[0102] Plasmid DNA was released from the cells and purified using the REAL PREP 96 plasmid kit (Qiagen). The recommended protocol was employed except for the following changes: 1) the bacteria were cultured in 1 ml of sterile TERRIFIC BROTH (BD Biosciences, Sparks Md.) with carbenicillin at 25 mg/l and glycerol at 0.4%; 2) after inoculation, the cultures were incubated for 19 hours and then lysed with 0.3 ml of lysis buffer; and 3) following isopropanol precipitation, the plasmid DNA pellet was resuspended in 0.1 ml of distilled water. After the last step in the protocol, samples were transferred to a 96-well block for storage at 4° C.

[0103] The cDNAs were prepared using either a MICROLAB 2200 system (Hamilton) or a HYDRA microdispenser (Robbins Scientific) in combination with the DNA ENGINE thermal cyclers (MJ Research) and sequenced by the method of Sanger and Coulson (1975; J Mol Biol 94:441-448) using either ABI PRISM 377 (Applied Biosystems) or MEGABACE 1000 (APB) sequencing systems. Most of the isolates were sequenced according to standard ABI protocols and kits (Applied Biosystems). The solution volumes were used at 0.25x-1.0x concentrations. In the alternative, cDNAs were sequenced using solutions and dyes from APB.

[0104] II Identification, Extension, Assembly, and Analyses of theSequences

[0105] Incyte clone 700145292 from the ZOOSEQ database (Incyte Genomics) was used to identify Incyte Clone 5595953 from the LIFESEQ database (Incyte Genomics). The first pass and extended cDNAs, SEQ ID NOs:3-11, which cluster with Incyte Clone 5595953 were assembled using Phred/Phrap or CONSED (Green, University of Washington, Seattle Wash.) or the GCG Fragment assembly system (Genetics Computer Group (GCG), Madison Wis.). The assembled sequence was searched for open reading frames, and the coding region was translated using MACDNASIS PRO software (Hitachi Software Engineering). The full length nucleotide and amino acid sequences were analyzed by BLAST queries against databases such as the GenBank databases, SwissProt, BLOCKS, PRINTS, Prosite, and PFAM and by LASERGENE software (DNASTAR). Functional analyses of the amino acid sequences were performed using MOTIFS (GCG) and HMM algorithms. Antigenic index (Jameson-Wolf analysis) of the amino acid sequences was determined using LASERGENE software (DNASTAR). Then, the clones and assembled sequence were compared using BLAST across all mammalian libraries to identify homologous nucleic acid sequences, SEQ ID NOs:12-25.
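The open reading frame search and translation step can be illustrated in a few lines. This sketch is not the MACDNASIS PRO procedure, whose internals are not described here; it assumes the standard genetic code, scans only the three forward frames, and uses hypothetical function names and a made-up example sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def longest_orf(dna):
    """Return the longest ATG-to-stop coding region (stop excluded) in the forward frames."""
    dna = dna.upper()
    best = ""
    for frame in range(3):
        codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
        start = None
        for idx, codon in enumerate(codons):
            if start is None and codon == "ATG":
                start = idx                      # first start codon opens the frame
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if (idx - start) * 3 > len(best):
                    best = "".join(codons[start:idx])
                start = None                     # look for the next ORF in this frame
    return best


def translate(cds):
    """Translate a coding region codon by codon; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3))


example = "CCATGGCTTAAGG"          # illustrative only, not from the Sequence Listing
print(translate(longest_orf(example)))  # MA
```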

[0106] III Sequence Similarity

[0107] Sequence similarity was calculated as percent identity based on comparisons between at least two nucleic acid or amino acid sequences using the clustal method of the MEGALIGN program (DNASTAR). The clustal method uses an algorithm which groups sequences into clusters by examining the distances between all pairs. After the clusters are aligned pairwise, they are realigned in groups. Percent similarity between two sequences, sequence A and sequence B, is calculated by dividing the length of sequence A, minus the number of gap residues in sequence A, minus the number of gap residues in sequence B, into the sum of the residue matches between sequence A and sequence B, times one hundred. Gaps of very low or zero similarity between the two sequences are not included.
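The percent similarity calculation stated above can be written out as a short sketch. The function and argument names are hypothetical, the inputs are assumed to have been counted from an existing pairwise alignment, and the numbers in the usage line are illustrative only.

```python
def percent_similarity(length_a, gaps_a, gaps_b, matches):
    """Percent similarity as worded in the text: residue matches divided by
    (length of A - gap residues in A - gap residues in B), times 100."""
    denominator = length_a - gaps_a - gaps_b
    if denominator <= 0:
        raise ValueError("length of A must exceed the total number of gap residues")
    return 100.0 * matches / denominator


# Illustrative values: 100 residues in A, 5 gap residues in each sequence,
# 81 matching residues.
print(percent_similarity(100, 5, 5, 81))
```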

[0108] IV Northern Analysis

[0109] Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound.

[0110] Analogous computer techniques applying BLAST were used to search for identical or related molecules in nucleotide databases such as GenBank or LIFESEQ (Incyte Genomics). Sequence-based analysis is much faster than membrane-based hybridization, and the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or similar. The basis of the search is the product score which is defined as: (percent sequence identity × percent mammalian BLAST score) divided by 100. The product score takes into account both the degree of similarity between two sequences and the length of the sequence match. For example, with a product score of 40, the match will be exact within a 1% to 2% error, and with a product score of at least 70, the match will be exact. Similar or related molecules are usually identified by selecting those which show product scores between 8 and 40.
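The product score defined above can be sketched as follows. The function names are hypothetical, and the way the thresholds of 8, 40, and 70 are turned into discrete categories (including which side of each boundary a score falls on) is an assumption made for illustration, not something the text specifies.

```python
def product_score(percent_identity, percent_blast_score):
    """(percent sequence identity x percent BLAST score) divided by 100."""
    return percent_identity * percent_blast_score / 100.0


def categorize(score):
    """Assumed reading of the thresholds given in the text."""
    if score >= 70:
        return "exact"
    if score >= 40:
        return "exact within 1% to 2% error"
    if score >= 8:
        return "similar or related"
    return "below the range used to select related molecules"


# Illustrative inputs only.
score = product_score(90.0, 80.0)
print(score, categorize(score))
```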

[0111] The results of northern analyses are reported as a percentage distribution of libraries in which the transcript encoding the mammalian protein occurred. Analysis involved the categorization of cDNA libraries by organ/tissue and disease. The organ/tissue categories included cardiovascular, dermatologic, developmental, endocrine, gastrointestinal, hematopoietic/immune, musculoskeletal, nervous, reproductive, and urologic. The disease categories included cancer, inflammation/trauma, cell proliferation, and neurological. For each category, the number of libraries expressing the sequence was counted and divided by the total number of libraries across all categories.
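The percentage distribution described above amounts to a per-category count divided by a total. A minimal sketch follows; the function name is hypothetical, the total is passed in explicitly because the text does not say how it is tallied, and the counts shown are invented for illustration and are not results of the analysis.

```python
def percentage_distribution(expressing_counts, total_libraries):
    """Map each category to 100 * (libraries expressing the sequence in that
    category) / (total number of libraries across all categories)."""
    if total_libraries <= 0:
        raise ValueError("total_libraries must be positive")
    return {
        category: 100.0 * count / total_libraries
        for category, count in expressing_counts.items()
    }


# Hypothetical counts, for illustration only.
counts = {"reproductive": 3, "nervous": 1, "gastrointestinal": 2}
for category, percent in percentage_distribution(counts, 8).items():
    print(category, percent)
```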

[0112] V Extension of Polynucleotides

[0113] The nucleic acid sequence of SEQ ID NO:1 was produced by extension of Incyte cDNA clones using oligonucleotide primers. One primer was synthesized to initiate 5′ extension of the known fragment, and the other, to initiate 3′ extension of the known fragment. The initial primers were designed using OLIGO software (Molecular Biology Insights) to be about 22 to 30 nucleotides in length, to have a GC content of about 50%, and to anneal to the target sequence at temperatures of about 68° C. to about 72° C. Any fragment which would result in hairpin structures and primer-primer dimerizations was avoided. Selected human cDNA libraries were used to extend the sequence. If more than one extension is needed, additional or nested sets of primers are designed.
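Two of the primer design criteria above, length and GC content, can be screened with a short sketch. This is not the OLIGO software; the function names and the tolerance of ±10 percentage points around 50% GC are assumptions, and annealing temperature, hairpin, and primer-dimer checks are deliberately left out.

```python
def gc_content(primer):
    """Percent of bases in the primer that are G or C."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_basic_criteria(primer, min_len=22, max_len=30,
                          gc_target=50.0, gc_tolerance=10.0):
    """Check length (about 22 to 30 nt) and GC content (about 50%).
    gc_tolerance is an assumed reading of the word 'about'."""
    if not min_len <= len(primer) <= max_len:
        return False
    return abs(gc_content(primer) - gc_target) <= gc_tolerance


# Candidate taken from the start of the coding region of SEQ ID NO:1,
# used here purely as an illustration.
candidate = "atgatgcaaatctgcgacacctac"
print(len(candidate), gc_content(candidate), passes_basic_criteria(candidate))
```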

[0114] High fidelity amplification was obtained by performing PCR in 96-well plates using the DNA ENGINE thermal cycler (MJ Research). The reaction mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg²⁺, (NH₄)₂SO₄, and β-mercaptoethanol, TAQ DNA polymerase (APB), ELONGASE enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for the primer pair selected from the plasmid: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 60° C., 1 min; Step 4: 68° C., 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C., 5 min; Step 7: storage at 4° C. In the alternative, parameters for the primer pair, T7 and SK+ (Stratagene), were as follows: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 57° C., 1 min; Step 4: 68° C., 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C., 5 min; and Step 7: storage at 4° C.

[0115] The concentration of DNA in each well was determined by dispensing 100 μl PICOGREEN quantitation reagent (0.25% (v/v); Molecular Probes) dissolved in 1× TE and 0.5 μl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Life Sciences, Acton Mass.) and allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II (Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the concentration of DNA. A 5 μl to 10 μl aliquot of the reaction mixture was analyzed by electrophoresis on a 1% agarose mini-gel to determine which reactions were successful in producing longer sequence.

[0116] The extended sequences were desalted, concentrated, transferred to 384-well plates, digested with CviJI chlorella virus endonuclease (Molecular Biology Research, Madison Wis.), and sonicated or sheared prior to religation into pUC 18 vector (APB). For shotgun sequencing, the digested fragments were separated on about 0.6-0.8% agarose gels, fragments were excised as visualized under UV light, and agar was removed/digested with AGARACE (Promega). Extended fragments were religated using T4 ligase (New England Biolabs, Beverly Mass.) into pUC 18 vector (APB), treated with Pfu DNA polymerase (Stratagene) to fill in restriction site overhangs, and transformed into competent E. coli cells. Transformed cells were selected on antibiotic-containing media, and individual colonies were picked and cultured overnight at 37° C. in 384-well plates in LB/2× carbenicillin liquid media.

[0117] The cells were lysed, and DNA was amplified using Taq DNA polymerase (APB) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 60° C., 1 min; Step 4: 72° C., 2 min; Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72° C., 5 min; and Step 7: storage at 4° C. DNA was quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries were reamplified using the conditions described above. Samples were diluted with 20% dimethylsulphoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (APB) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).

[0118] In like manner, the nucleotide sequence of SEQ ID NO:1 is used to obtain regulatory sequences using the procedure above, oligonucleotides designed for outward extension, and a genomic library.

[0119] VI Labeling of Probes and Hybridization Analyses

[0120] Polynucleotide sequences are isolated from a biological source and applied to a substrate for standard nucleic acid hybridization protocols by one of the following methods. A mixture of target nucleic acids, a restriction digest of genomic DNA, is fractionated by electrophoresis through a 0.7% agarose gel in 1× TAE [Tris-acetate-ethylenediamine tetraacetic acid (EDTA)] running buffer and transferred to a nylon membrane by capillary transfer using 20× saline sodium citrate (SSC). Alternatively, the target nucleic acids are individually ligated to a vector and inserted into bacterial host cells to form a library. Target nucleic acids are arranged on a substrate by one of the following methods. In the first method, bacterial cells containing individual clones are robotically picked and arranged on a nylon membrane. The membrane is placed on bacterial growth medium, LB agar containing carbenicillin, and incubated at 37° C. for 16 hours. Bacterial colonies are denatured, neutralized, and digested with proteinase K. Nylon membranes are exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene) to cross-link DNA to the membrane.

[0121] In the second method, target nucleic acids are amplified from bacterial vectors by thirty cycles of PCR using primers complementary to vector sequences flanking the insert. Amplified target nucleic acids are purified using SEPHACRYL-400 beads (APB). Purified target nucleic acids are robotically arrayed onto a glass microscope slide (Corning Life Sciences). The slide was previously coated with 0.05% aminopropyl silane (Sigma-Aldrich) and cured at 110° C. The arrayed glass slide (microarray) is exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene).

[0122] cDNA probe sequences are made from mRNA templates. Five micrograms of mRNA is mixed with 1 μg random primer (Life Technologies), incubated at 70° C. for 10 minutes, and lyophilized. The lyophilized sample is resuspended in 50 μl of 1× first strand buffer (cDNA Synthesis systems; Life Technologies) containing a dNTP mix, [α-³²P]dCTP, dithiothreitol, and MMLV reverse transcriptase (Stratagene), and incubated at 42° C. for 1-2 hours. After incubation, the probe is diluted with 42 μl dH₂O, heated to 95° C. for 3 minutes, and cooled on ice. mRNA in the probe is removed by alkaline degradation. The probe is neutralized, and degraded mRNA and unincorporated nucleotides are removed using a PROBEQUANT G-50 MicroColumn (APB). Probes can be labeled with fluorescent nucleotides, Cy3-dCTP or Cy5-dCTP (APB), in place of the radiolabeled nucleotide, [³²P]dCTP.

[0123] Hybridization is carried out at 65° C. in a hybridization buffer containing 0.5 M sodium phosphate (pH 7.2), 7% SDS, and 1 mM EDTA. After the substrate is incubated in hybridization buffer at 65° C. for at least 2 hours, the buffer is replaced with 10 ml of fresh buffer containing the probe sequences. After incubation at 65° C. for 18 hours, the hybridization buffer is removed, and the substrate is washed sequentially under increasingly stringent conditions, up to 40 mM sodium phosphate, 1% SDS, 1 mM EDTA at 65° C. To detect signal produced by a radiolabeled probe hybridized on a membrane, the substrate is exposed to a PHOSPHORIMAGER cassette (APB), and the image is analyzed using IMAGEQUANT data analysis software (APB). To detect signals produced by a fluorescent probe hybridized on a microarray, the substrate is examined by confocal laser microscopy, and images are collected and analyzed using GEMTOOLS gene expression analysis software (Incyte Genomics).

[0124] VII Complementary Polynucleotides

[0125] Sequences complementary to the polynucleotide, or a fragment thereof, are used to detect, decrease, or inhibit gene expression. Although use of oligonucleotides comprising from about 15 to about 30 base pairs is described, essentially the same procedure is used with larger or smaller fragments or their derivatives (PNAs). Oligonucleotides are designed using OLIGO software (Molecular Biology Insights) and SEQ ID NO:1 or its fragments, SEQ ID NOs:3-9. To inhibit transcription by preventing promoter binding, a complementary oligonucleotide is designed to bind to the most unique 5′ sequence, most preferably about 10 nucleotides before the initiation codon of the open reading frame. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the mRNA encoding the mammalian protein.
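Deriving a complementary oligonucleotide from a chosen target window can be sketched as below. The function names, the window coordinates, and the short target fragment are hypothetical; selecting the window itself (uniqueness, position relative to the initiation codon) is left to the design software and is not modeled here.

```python
# Complement map for DNA bases; n (unknown base) maps to itself.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def complementary_oligo(target, start, length):
    """Oligonucleotide complementary to target[start:start + length]."""
    window = target[start:start + length]
    if len(window) != length:
        raise ValueError("requested window runs past the end of the target")
    return reverse_complement(window)


# Hypothetical 10-nucleotide window taken from a short illustrative fragment.
print(complementary_oligo("GGCCACCATGATG", 0, 10))
```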

[0126] VIII Expression of the Mammalian Protein

[0127] Expression and purification of the mammalian protein are achieved using bacterial or virus-based expression systems. For expression in bacteria, cDNA is subcloned into a vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. Recombinant vectors are transformed into bacterial hosts, e.g., BL21(DE3). Antibiotic resistant bacteria express the mammalian protein upon induction with isopropyl beta-D-thiogalactopyranoside. Expression in eukaryotic cells is achieved by infecting Spodoptera frugiperda (Sf9) insect cells with recombinant baculovirus, Autographa californica nuclear polyhedrosis virus. The nonessential polyhedrin gene of baculovirus is replaced with the mammalian cDNA by either homologous recombination or bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of cDNA transcription.

[0128] In most expression systems, the mammalian protein is synthesized as a fusion protein with GST or FLAG, which permits rapid, single-step, affinity-based purification of recombinant fusion protein from crude cell lysates. GST enables the purification of fusion proteins on immobilized glutathione under conditions that maintain protein activity and antigenicity (APB). Following purification, the GST moiety can be proteolytically cleaved from the mammalian protein at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak, Rochester N.Y.). 6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins (Qiagen). Methods for protein expression and purification are discussed in Ausubel (supra, unit 16). Purified mammalian protein obtained by these methods can be used directly in the following activity assay.

[0129] IX Functional Assays

[0130] Protein function is assessed by expressing the sequences encoding LMTF at physiologically elevated levels in mammalian cell culture. The polynucleotide is subcloned into pCMV SPORT vector (Life Technologies), which contains the strong cytomegalovirus promoter, and 5-10 μg of the vector is transformed into an endothelial or hematopoietic human cell line using electroporation. An additional 1-2 μg of a plasmid containing sequence encoding CD64-GFP (Clontech) is co-transformed to provide a fluorescent marker to identify transformed cells using flow cytometry.

[0131] The influence of the introduced genes on expression can be assessed using purified populations of these transformed cells. Since CD64-GFP, which is expressed on the surface of transformed cells, binds to conserved regions of human immunoglobulin G (IgG), the transformed cells are separated using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success N.Y.). mRNA is purified from the cells and analyzed by hybridization techniques.

[0132] X Production of LMTF Specific Antibodies

[0133] LMTF purified using polyacrylamide gel electrophoresis is used to immunize rabbits and to produce antibodies using standard protocols.

[0134] Alternatively, the amino acid sequence of LMTF is analyzed using LASERGENE software (DNASTAR) to determine regions of high immunogenicity. An immunogenic epitope such as those near the C-terminus or in hydrophilic regions is selected, synthesized, and used to raise antibodies by means known to those of skill in the art.

[0135] Typically, epitopes of about 15 residues in length are produced using an ABI 431A peptide synthesizer (Applied Biosystems) using Fmoc-chemistry and coupled to KLH (Sigma-Aldrich) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester to increase immunogenicity. Rabbits are immunized with the epitope-KLH complex in complete Freund's adjuvant. Immunizations are repeated at intervals thereafter in incomplete Freund's adjuvant. After a sufficient period of time, antisera are drawn and tested for antipeptide activity. Testing involves binding the peptide to plastic, blocking with 1% bovine serum albumin, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. Methods well known in the art are used to determine antibody titer and the amount of complex formation.

[0136] XI Purification of Naturally Occurring Protein Using Specific Antibodies

[0137] Naturally occurring or recombinant mammalian protein is purified by immunoaffinity chromatography using antibodies specific for the protein. An immunoaffinity column is constructed by covalently coupling the antibody to CNBr-activated SEPHAROSE resin (APB). Media containing the protein is passed over the immunoaffinity column, and the column is washed using high ionic strength buffers in the presence of detergent to allow preferential absorbance of the protein. After coupling, the column is eluted using a buffer of pH 2-3 or a high concentration of urea or thiocyanate ion to disrupt antibody/protein binding, and the protein is collected.

[0138] XII Screening Molecules for Specific Binding with the Polynucleotide or Protein

[0139] The nucleic acid sequence, or fragments thereof, or the protein, or portions thereof, are labeled with ³²P-dCTP, Cy3-dCTP, Cy5-dCTP (APB), or BODIPY or FITC (Molecular Probes), respectively. Libraries of candidate molecules previously arranged on a substrate are incubated in the presence of labeled nucleic acid sequence or protein. After incubation, the substrate is washed, and any position on the substrate retaining label, which indicates specific binding or complex formation, is assayed, and the binding molecule is identified. Data obtained using different concentrations of the nucleic acid or protein are used to calculate affinity between the labeled nucleic acid or protein and the bound molecule.

[0140] XIII Demonstration of Protein Activity

[0141] LMTF activity is measured by its ability to modulate transcription of a reporter gene. The assay entails the use of a reporter gene construct that consists of a transcription factor response element fused upstream to sequences encoding the E. coli β-galactosidase enzyme (LacZ). Sequences encoding LMTF are subcloned into a mammalian expression vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice include pCMV SPORT (Life Technologies) and pCR 3.1 (Invitrogen, Carlsbad Calif.), both of which contain the cytomegalovirus promoter. The recombinant vector and reporter gene construct are co-transformed into a human cell line, preferably of neuronal origin, using either liposome formulations or electroporation. The amount of β-galactosidase enzyme activity associated with LMTF transfected cells, relative to control cells transformed with the reporter construct alone, is proportional to the amount of transcription modulated by the LMTF gene product.

[0142] All patents and publications mentioned in the specification are incorporated by reference herein. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the field of molecular biology or related fields are intended to be within the scope of the following claims.

TABLE 1

Nucleic Acid  Incyte        Nucleotide                                              Percent
SEQ ID NO:    Clone Number  Length      Source               Library    Coverage   Identity
3             1479946F6     502         Homo sapiens         CORPNOT02  1-502      n/a
4             3241390F6     529         Homo sapiens         COLAUCT01  281-801    n/a
5             1432520R1     562         Homo sapiens         BEPINON01  599-1178   n/a
6             4534217H1     254         Homo sapiens         OVARNOT12  1152-1414  n/a
7             2191992H1     238         Homo sapiens         THYRTUT03  1200-1437  n/a
8             1320132T1     661         Homo sapiens         BLADNOT04  1299-1957  n/a
9             1516707T1     624         Homo sapiens         PANCTUT01  1445-2070  n/a
10            5595953H1     252         Homo sapiens         COLCDTT03  1592-1845  n/a
11            1988906R6     302         Homo sapiens         LUNGAST01  1851-2092  n/a
12            700712962H1   144         Macaca fascicularis  MNBFNOTO2  12-156     92.4
13            700715135H1   274         Macaca fascicularis  NBCNOT01   292-564    93.1
14            701253541H1   273         Mus musculus         MOLUDIT07  560-1032   71.2
15            701252210H1   250         Mus musculus         MOLUDIT07  1564-1831  81.0
16            700545683H1   272         Rattus norvegicus    RASPNOT01  1-271      52.2
17            700145292H1   257         Rattus norvegicus    RAPRNOT01  191-488    44.7
18            700861443H1   239         Rattus norvegicus    RABGNOT02  338-591    63.2
19            700225363H1   302         Rattus norvegicus    RAKINOT01  479-780    86.4
20            700643425H1   286         Rattus norvegicus    RABUNOT01  593-879    81.1
21            700525920H1   285         Rattus norvegicus    RABMNOT01  783-1066   77.2
22            700773927H1   270         Rattus norvegicus    RABONOT01  1001-1197  54.8
23            700513679H1   283         Rattus norvegicus    RASNNOT01  1318-1594  72.1
24            700767486H1   266         Rattus norvegicus    RAHYNOT01  1594-1876  73.7
25            700327166H1   199         Rattus norvegicus    RASNNOT01  1813-2008  74.4

[0143]

0 SEQUENCE LISTING <160> NUMBER OF SEQ ID NOS: 27 <210> SEQ ID NO 1<211> LENGTH: 2092 <212> TYPE: DNA <213> ORGANISM: Homo sapiens <220>FEATURE: <221> NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte IDNo.: 5595953 <400> SEQUENCE: 1 gggcctttta tctcggtgct gccgggggaggcgggaggag gagacaccag gggtggccct 60 gagcgccggc gacacctttc ctggactataaattgagcac ctgggatggg tagggggcca 120 acgcagtcac cgccgtccgc agtcacagtccagccactga ccgcagcagc gcccttgcgt 180 acagccgctt gcagcgagaa cactgaattgccaacgagca ggagagtctc aaggcgcaag 240 aggaggccag ggctcgaccc acagagcaccctcagccatc gcgagtttcc gggcgccaaa 300 gccaggagaa gccgcccatc ccgcagggccggtctgccag cgagacgaga gttggcgagg 360 gcggaggagt gccgggaatc ccgccacaccggctatagcc aggcccccag cgcgggcctt 420 ggagagcgcg tgaaggcggg catccccttgacccggccga ccatccccgt gcccctgcgt 480 ccctgcgctc caacgtccgc gcggccaccatgatgcaaat ctgcgacacc tacaaccaga 540 agcactcgct ctttaacgcc atgaatcgcttcattggcgc cgtgaacaac atggaccaga 600 cggtgatggt gcccagcttg ctgcgcgacgtgcccctggc tgaccccggg ttagacaacg 660 atgttggcgt ggaggtaggc ggcagtggcggctgcctgga ggagcgcacg cccccagtcc 720 ccgactcggg aagcgccaat ggcagctttttcgcgccctc tcgggacatg tacagccact 780 acgtgcttct caagtccatc cgcaacgacatcgagtgggg ggtcctgcac cagccgcctc 840 caccggctgg gagcgaggag ggcagtgcctggaagtccaa ggacatcctg gtggacctgg 900 gccacttgga gggtgcggac gccggcgaagaagacctgga acagcagttc cactaccacc 960 tgcgcgggct gcacactgtg ctctcgaaactcacgcgcaa agccaacatc ctcactaaca 1020 gatacaagca ggagatcggc ttcggcaattggggccactg aggcgtggcg cccgtggctg 1080 cccagcacct tcttcgaccc atctcaccctctctcattcc tcaaagcttt tttttttttt 1140 cctggctggg gggcgggaag ggcagactgcaaactggggg gctgcgtacg tgcaggaggc 1200 gcggtggggc tgcgtggagg agggggccacgtgtgagaga gaagaaaatg gtggccggag 1260 atgggagggc ccaaggaacc tcctgggagggggcctgcat tctatgttgg tgggaatggg 1320 actgggctga cgccctgcat tcagcctgtgcctttcctgg ggtttctttt ctgttctttt 1380 cggaggagag ggcccgagaa ggggccataccagggcgcgg cgctgggttg ccacacttgg 1440 gaaagcagcc cggagctggg tgctggggaaggcggggcgc gtagcctccc gccgccctgc 1500 ggttgggccg gtggaggccc 
aggcgttgctaggattgcat cagttttcct gtttgcacta 1560 tttctttttg taacattggc cctgtgtgaagtatttcgaa tctcctcctt gctctgaaac 1620 ttcagcgatt ccattgtgat aagcgcacaaacagcactgt ctgtcggtaa tcggtactac 1680 tttattaatg attttctgtt acactgtatagtagtcctat ggcaccccca ccccatccct 1740 ttcgtgccac tcccgtcccc acccccaccccagtgtgtat aagctggcat ttcgccagct 1800 tgtacgtagc ttgccactca gtgaaaataataacattatt atgagaaagt ggacttaacc 1860 gaaatggaac caactgacat tctatcgtgttgtacataga atgatgaagg gttccactgt 1920 tgttgtatgt cttaaattta tttaaaactttttttaatcc agatgtagac tatattctaa 1980 aaaataaaaa agcaaatgtg tcaactaaattggacaagcg tctggtcctc attaatctgc 2040 caatgaatgg tttcgtcatt aaataaaaatcaatttaatt gatttactag ca 2092 <210> SEQ ID NO 2 <211> LENGTH: 183 <212>TYPE: PRT <213> ORGANISM: Homo sapiens <220> FEATURE: <221> NAME/KEY:misc_feature <223> OTHER INFORMATION: Incyte ID No.: 5595953 <400>SEQUENCE: 2 Met Met Gln Ile Cys Asp Thr Tyr Asn Gln Lys His Ser Leu Phe1 5 10 15 Asn Ala Met Asn Arg Phe Ile Gly Ala Val Asn Asn Met Asp Gln 2025 30 Thr Val Met Val Pro Ser Leu Leu Arg Asp Val Pro Leu Ala Asp 35 4045 Pro Gly Leu Asp Asn Asp Val Gly Val Glu Val Gly Gly Ser Gly 50 55 60Gly Cys Leu Glu Glu Arg Thr Pro Pro Val Pro Asp Ser Gly Ser 65 70 75 AlaAsn Gly Ser Phe Phe Ala Pro Ser Arg Asp Met Tyr Ser His 80 85 90 Tyr ValLeu Leu Lys Ser Ile Arg Asn Asp Ile Glu Trp Gly Val 95 100 105 Leu HisGln Pro Pro Pro Pro Ala Gly Ser Glu Glu Gly Ser Ala 110 115 120 Trp LysSer Lys Asp Ile Leu Val Asp Leu Gly His Leu Glu Gly 125 130 135 Ala AspAla Gly Glu Glu Asp Leu Glu Gln Gln Phe His Tyr His 140 145 150 Leu ArgGly Leu His Thr Val Leu Ser Lys Leu Thr Arg Lys Ala 155 160 165 Asn IleLeu Thr Asn Arg Tyr Lys Gln Glu Ile Gly Phe Gly Asn 170 175 180 Trp GlyHis <210> SEQ ID NO 3 <211> LENGTH: 502 <212> TYPE: DNA <213> ORGANISM:Homo sapiens <220> FEATURE: <221> NAME/KEY: unsure <222> LOCATION: 181,317, 363, 393, 402, 421, 432, 458, 461, 463, 470, 478, 482, 494 <223>OTHER INFORMATION: a or g or c or t, unknown, or other <220> FEATURE:<221> NAME/KEY: 
misc_feature <223> OTHER INFORMATION: Incyte ID No.:1479946F6 <400> SEQUENCE: 3 gggcctttta tctcggtgct gccgggggag gcgggaggaggagacaccag gggtggccct 60 gagcgccggc gacacctttc ctggactata aattgagcacctgggatggg tagggggcca 120 acgcatcacc gccgtccgca gtcacagtcc agccactgaccgcagcagcg cccttgcgta 180 nagccgcttg cagcgagaac actgaattgc caacgagcaggagagtctca aggcgcaaga 240 ggaggccagg ggctcgaccc acagagcacc ctcagccatcgcgagtttcc gggcgccaaa 300 gccaggagaa gccgccnatc ccgcaaggcc cggtctgccagcgagacgag attggcgagg 360 gcngaagagt gccgggaatc ccgccacacc ggntatagcaancccccagc gcgggctttg 420 naaacgcctg angcgggcat cccttgaccg gcgacatnccntnccctgcn tcctgggntc 480 ancttcgggc gcancatatt ac 502 <210> SEQ ID NO 4<211> LENGTH: 529 <212> TYPE: DNA <213> ORGANISM: Homo sapiens <220>FEATURE: <221> NAME/KEY: unsure <222> LOCATION: 48, 172, 214, 348, 364,414, 417, 428, 430, 436, 471, 491, 495, 503, 511, 523 <223> OTHERINFORMATION: a or g or c or t, unknown, or other <220> FEATURE: <221>NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte ID No.: 3241390F6<400> SEQUENCE: 4 gcgagtttcc gggcgccaaa gccaggagaa gccgcccatc ccgcaggncaggtctgccag 60 cgagacgaga gttggcgagg gcggaggagt gccgggaatc ccgccacaccggctatagcc 120 aggcccccag cgcgggcctt ggagagcgcg tgaaggcggg catccccttganccggccga 180 ccatccccgt gcccctgcgt ccctgcgctc caangtccgc gcggccaccatgatgcaaat 240 ctgcgacacc tacaaccaga agcactcgct ctttaacgcc atgaatcgcttcattggcgc 300 cgtgaacaac atggaccaga cggtgatggt gcccagcttg tgcgcgangtgcccctggct 360 gacnccgggt tagacaacga tgttggcgtg gaggtaagcg gcaatggcggcttnctngag 420 gagcgcangn ccccanttcc cgactcggga agcgccaatg gagcttttttnggggcctct 480 tggggacaat nttanaagcc aantaagtgg nttctcaaag ttncatccg 529<210> SEQ ID NO 5 <211> LENGTH: 562 <212> TYPE: DNA <213> ORGANISM: Homosapiens <220> FEATURE: <221> NAME/KEY: unsure <222> LOCATION: 288, 289,291, 293, 313, 434, 514, 560 <223> OTHER INFORMATION: a or g or c or t,unknown, or other <220> FEATURE: <221> NAME/KEY: misc_feature <223>OTHER INFORMATION: Incyte ID No.: 1432520R1 <400> SEQUENCE: 5 
gacggtgatggtgcccagct tgctgcgcga cgtgcccctg gctgaccccg ggttagacaa 60 cgatgttggcgtggaggtag gcggcagtgg cggctgcctg gaggagcgca cgcccccagt 120 ccccgactcgggaagcgcca atggcagctt tttcgcgccc tctcgggaca tgtacagcca 180 ctacgtgcttctcaagtcca tccgcaacga catcgagtgg ggggtcctgc accagccgcc 240 tccaccggctgggagcgagg agggcagtgc ctggaagtcc aaggacannc ngntggacct 300 gggccacttgganggtgcgg acgccggcga agaagacctg gaacagcagt tccactacca 360 cctgcgcgggctgcacactg tgtctcgaaa ctcacgcgca aagccaacat cctcactaac 420 agtacaagcaggantcggtt cggaattggg ggcactgagg cgtggcgccc gtggctgccc 480 agaacttttcgaccatctaa cctctctatt cctnaagctt tttttttttc cggctggggg 540 cggaaggcaactgcaaattn gg 562 <210> SEQ ID NO 6 <211> LENGTH: 254 <212> TYPE: DNA<213> ORGANISM: Homo sapiens <220> FEATURE: <221> NAME/KEY: unsure <222>LOCATION: 80, 215, 238, 239, 242, 243, 244, 249 <223> OTHER INFORMATION:a or g or c or t, unknown, or other <220> FEATURE: <221> NAME/KEY:misc_feature <223> OTHER INFORMATION: Incyte ID No.: 4534217H1 <400>SEQUENCE: 6 gggcagactg caaactgggg ggctgcgtac gtgcaggagg cgcggtggggctgcgtggag 60 gagggggcca cgtgtgagan agaagaaaat ggtggccgga gatgggagggcccaaggaac 120 ctcctgggag ggggcctgca ttctatgttg gtgggaatgg gactgggctgacgccctgca 180 ttcagcctgt gcctttcctg gggtttcttt tctgntcttt tcggaggagaaggcccgnna 240 annngccana ccaa 254 <210> SEQ ID NO 7 <211> LENGTH: 238<212> TYPE: DNA <213> ORGANISM: Homo sapiens <220> FEATURE: <221>NAME/KEY: unsure <222> LOCATION: 19, 133 <223> OTHER INFORMATION: a or gor c or t, unknown, or other <220> FEATURE: <221> NAME/KEY: misc_feature<223> OTHER INFORMATION: Incyte ID No.: 2191992H1 <400> SEQUENCE: 7cgcggtgggg ctgcgtggng gagggggcca cgtgtgagag agaagaaaat ggtggccgga 60gatgggaggg cccaaggaac ctcctgggag ggggcctgca ttctatgttg gtgggaatgg 120gactgggctg acnccctgca ttcagcctgt gcctttcctg gggtttcttt tctgttcttt 180tcggaggaga gggcccgaga aggggccata ccagggcgcg gcgctgggtt gccacact 238<210> SEQ ID NO 8 <211> LENGTH: 661 <212> TYPE: DNA <213> ORGANISM: Homosapiens <220> FEATURE: <221> NAME/KEY: unsure <222> 
LOCATION: 2, 5, 65,68, 70, 75, 96, 104, 107, 110, 116, 131, 194, 195, 198, 199, 200, 204,225, 227, 235, 309, 469, 534, 536, 559, 563, 591, 603, 604, 612, 619,632, 657 <223> OTHER INFORMATION: a or g or c or t, unknown, or other<220> FEATURE: <221> NAME/KEY: misc_feature <223> OTHER INFORMATION:Incyte ID No.: 1320132T1 <400> SEQUENCE: 8 tnganttaaa aaaagttttaaataaattta agacatacca acaacagtgg aaaccttcat 60 cattntangn acaanacgatagaatgtcag ttgggnccat ttcngtnaan tccacnttct 120 cataataatg ntattattttcactgagtgg caagctacgt acaagctggc gaaatgccag 180 cttatacaca ctgnngtnnnggtngggacg ggagtggcac gaaangnatg gggtnggggt 240 gccataggac tactatacagtgtaacagaa aatcattaat aaagtagtac cgattaccga 300 cagacagtnc tgtttgtgcgcttatcacaa tggaatcgct gaagtttcag agcaaggagg 360 agattcgaaa tacttcacacagggccaatg ttacaaaaag aaatagtgca aacaggaaaa 420 ctgatgcaat cctagcaacgcctgggcctc caccggccca accgcaggnt gcgggaggct 480 acgcgccccg ccttccccagcacccagctc cgggctgctt tcccaagtgt tgcnanccaa 540 cgccgcgccc tggtattgncccntctcggg cctttcctcc gaaaagaacc ngaaagaacc 600 ccnngaaagg cncaggctnaattcagggcg tnacccagtt ccattcccac caacttngat 660 t 661 <210> SEQ ID NO 9<211> LENGTH: 624 <212> TYPE: DNA <213> ORGANISM: Homo sapiens <220>FEATURE: <221> NAME/KEY: unsure <222> LOCATION: (302)...(307), 330, 332,334, 560, 572, 574, 577, 593, 609 <223> OTHER INFORMATION: a or g or cor t, unknown, or other <220> FEATURE: <221> NAME/KEY: misc_feature<223> OTHER INFORMATION: Incyte ID No.: 1516707T1 <400> SEQUENCE: 9atttttattt aatgacgaaa ccattcattg gcagattaat gaggaccaga cgcttttcca 60atttagttga cacatttgct tttttatttt ttagaatata gtctacatct ggattaaaaa 120aagttttaaa taaatttaag acatacaaca acagtggaac ccttcatcat tctatgtaca 180acacgataga atgtcagttg gttccatttc ggttaagtcc actttctcat aataatgtta 240ttattttcac tgagtggcaa gctacgtaca agctggcgaa atgccagctt atacacactg 300gnnnnnnggt ggggacggga gtggcacgan angnatgggg tgggggtgcc ataggactac 360tatacagtgt aacagaaaat cattaataaa gtagtaccga ttaccgacag acagtgctgt 420ttgtgcgctt atcacaatgg aatcgctgaa gtttcagagc aaggaggaga 
ttcgaaatac 480ttcacacagg gccaatgtta caaaaagaaa tagtgcaaac aggaaaactg atgcaatcct 540agcaacgcct gggcctccan cggcccaacc gnangcngcg ggaggctacg cgncccgctt 600ccccagcanc cagctccggg gtgt 624 <210> SEQ ID NO 10 <211> LENGTH: 252<212> TYPE: DNA <213> ORGANISM: Homo sapiens <220> FEATURE: <221>NAME/KEY: unsure <222> LOCATION: 186, 189, 195, 196, 200, 204, 208, 213,222, 226, 228, 236, 244, 248, 251 <223> OTHER INFORMATION: a or g or cor t, unknown, or other <220> FEATURE: <221> NAME/KEY: misc_feature<223> OTHER INFORMATION: Incyte ID No.: 5595953H1 <400> SEQUENCE: 10atttcgaatc tcctccttgc tctgaaactt cagcgattcc attgtgataa gcgcacaaac 60agcactgtct gtcggtaatc ggtactactt tattaatgat tttctgttac actgtatagt 120agtcctatgg cacccccacc ccatcccttt cgtgccactc ccgtccccac ccccacccca 180gggggntang cgggnntttn gccngctnga cgnagctggc cnctcngnga aaatantacc 240tttnttgngg ng 252 <210> SEQ ID NO 11 <211> LENGTH: 302 <212> TYPE: DNA<213> ORGANISM: Homo sapiens <220> FEATURE: <221> NAME/KEY: unsure <222>LOCATION: 89, 186, 251, 252, 254, 255, 258, 259, 260, 267, 268, 278,281, 283, 286, 291, 292, 294, 297 <223> OTHER INFORMATION: a or g or cor t, unknown, or other <220> FEATURE: <221> NAME/KEY: misc_feature<223> OTHER INFORMATION: Incyte ID No.: 1988906R6 <400> SEQUENCE: 11ggacttaacc gaaatggaac caactgacat tctatcgtgt tgtacataga atgatgaagg 60gttccactgt tgttgtatgt cttaaattna tttaaaactt tttttaatcc agatgtagac 120tatattctaa aaaataaaaa agcaaatgtg tcaactaaat tggacaagcg tctggtcctc 180attaanctgc caatgaatgg tttcgtcatt aaataaaaat caatttaatt gatttactag 240caaaagtaga nnannaannn aaaaaannaa aaaaaaanac naangntaac nntnccnaaa 300 aa302 <210> SEQ ID NO 12 <211> LENGTH: 144 <212> TYPE: DNA <213> ORGANISM:Macaca fascicularis <220> FEATURE: <221> NAME/KEY: unsure <222>LOCATION: 25 <223> OTHER INFORMATION: a or g or c or t, unknown, orother <220> FEATURE: <221> NAME/KEY: misc_feature <223> OTHERINFORMATION: Incyte ID No.: 700712962H1 <400> SEQUENCE: 12 ctcgatgctgccgggggagg cgggnggagg agacaccagg ggtggccctg agcaccggcg 60 
acacctttcctggactataa attgagcacc tgggatgggt agggggtcaa cgcatcaccg 120 ccgcccgcagtcacagtccg gcca 144 <210> SEQ ID NO 13 <211> LENGTH: 274 <212> TYPE: DNA<213> ORGANISM: Macaca fascicularis <220> FEATURE: <221> NAME/KEY:unsure <222> LOCATION: 19, 253 <223> OTHER INFORMATION: a or g or c ort, unknown, or other <220> FEATURE: <221> NAME/KEY: misc_feature <223>OTHER INFORMATION: Incyte ID No.: 700715135H1 <400> SEQUENCE: 13gacgccaaag ccaggagang ccgcccatcc cgcaggtccg gttctgccag cgagacgaga 60gttggcgagg gcggaggagt gccgggaatc ccgccacacc ggctatagcc aggcccccag 120cgcgggcctt ggagagagcg tgaaggcggg catccctttg acccggccga ccatccccgt 180gtctctgcgt ccctgcgctc cagcgcccgc gcggccacca tgatgcaaat ctgcgacacc 240tacaaccaga agnactcgct ctttaacggc atga 274 <210> SEQ ID NO 14 <211>LENGTH: 273 <212> TYPE: DNA <213> ORGANISM: Mus musculus <220> FEATURE:<221> NAME/KEY: unsure <222> LOCATION: 245 <223> OTHER INFORMATION: a org or c or t, unknown, or other <220> FEATURE: <221> NAME/KEY:misc_feature <223> OTHER INFORMATION: Incyte ID No.: 701253541H1 <400>SEQUENCE: 14 cccgggtaca tgtacagcca ctacgtgctg ctcaagtcca tccgcaatgatatcgagtgg 60 ggagtcctgc accagccttc gtctccgccg gccgggagcg aggagagcacctggaagccc 120 aaggacatcc tggtgggcct gagtcacttg gagagcgcgg atgcggcgaggaagatctgg 180 agcagcagtt ccactaccac ctgcgcgggc tgcacaccgt gctctccaaactcacccgaa 240 aagcnaacat cctcaccatt agatacaagc agg 273 <210> SEQ ID NO15 <211> LENGTH: 250 <212> TYPE: DNA <213> ORGANISM: Mus musculus <220>FEATURE: <221> NAME/KEY: unsure <222> LOCATION: 45 <223> OTHERINFORMATION: a or g or c or t, unknown, or other <220> FEATURE: <221>NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte ID No.:701252210H1 <400> SEQUENCE: 15 ctttttgtaa cagtgaccct gtcttaagtctttcagatct ctttnctttg aaacttcgtc 60 gattccattg tgataagcgc acaaacagcactgttggtaa ccggtactac tttattaatg 120 attttctgtt acactgtaca gtagtcctgtggcaccctat ccctttcacg ccacccctcc 180 cccgcccgtg tgtgtaaact ggcgatgtgccagctaggat gaagcttgcc actcggctag 240 cgaaaataat 250 <210> SEQ ID NO 16<211> LENGTH: 
272 <212> TYPE: DNA <213> ORGANISM: Rattus norvegicus<220> FEATURE: <221> NAME/KEY: unsure <222> LOCATION: 56, 64, 67, 84,209, 234, 249 <223> OTHER INFORMATION: a or g or c or t, unknown, orother <220> FEATURE: <221> NAME/KEY: misc_feature <223> OTHERINFORMATION: Incyte ID No.: 700545683H1 <400> SEQUENCE: 16 gaccttttatctgtgctgct ggaggaggta ggaggaggag acatcagggg tggtcntggg 60 gcgnctnggacacctatcct ggantataaa ttgagcacct gggatgcagc agggggccga 120 agcagccaccatcacccata ctcacagtcc gatcagtgac cgcagcagcg cccttgggca 180 gccaccgtgccgcaactacg agcactgana accaggggat ttcgcagtgc aagngatcaa 240 ggctagacncaaccacctac catcctcgtg ag 272 <210> SEQ ID NO 17 <211> LENGTH: 257 <212>TYPE: DNA <213> ORGANISM: Rattus norvegicus <220> FEATURE: <221>NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte ID No.:700145292H1 <400> SEQUENCE: 17 gcaactcgag cactgagaac caggggatttcgcagtgcaa gagatcaagg ctagacccaa 60 ccacctaaca tcctcgtgag ccaaagcttagagcagccgc gcatcaggaa gggctgaact 120 gagacagaag gaagagttag agagggcggagaaggatctg ggaatccagt cacaccggct 180 tcaagcaggc tcccggcatt agcgtttgaaggcgggcatc gccagaggtc tatctcggtg 240 taccagtgtc cctgtgt 257 <210> SEQ IDNO 18 <211> LENGTH: 239 <212> TYPE: DNA <213> ORGANISM: Rattusnorvegicus <220> FEATURE: <221> NAME/KEY: unsure <222> LOCATION: 16, 68<223> OTHER INFORMATION: a or g or c or t, unknown, or other <220>FEATURE: <221> NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte IDNo.: 700861443H1 <400> SEQUENCE: 18 gacagaagga agagtnagag agggcggagaaggatctggg aatccagtca caccggcttc 60 aagcaggntt cccggcatta gcgtttgaaggcgggcatcg ccagaggtct atctcggtgt 120 accagtgtcc ctgtgtttcc gcgcccgctcggccaccatg atgcaaatct gcgacacata 180 caaccagaag cactcgctct ttaacgccatgaatcgcttc attggcgcgg tgaacaaca 239 <210> SEQ ID NO 19 <211> LENGTH: 302<212> TYPE: DNA <213> ORGANISM: Rattus norvegicus <220> FEATURE: <221>NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte ID No.:700225363H1 <400> SEQUENCE: 19 gtccctgtgt ttccgcgccc gctcggccaccatgatgcaa atctgcgaca catacaacca 60 gaagcactcg 
ctctttaacg ccatgaatcgcttcattggc gcggtgaaca acatggacca 120 gacggtgatg gtgcccagtc tgctgcgcgatgtacccctg tccgagccgg atctagacaa 180 cgaggtcagc gtggaggtag gcggcagtggcagctgcctg gaggagcgca cgaccccggc 240 cccaagcccg ggcagcgcca atggaagctttttcgcgccc tcccgggaca tgtacagcca 300 ct 302 <210> SEQ ID NO 20 <211>LENGTH: 286 <212> TYPE: DNA <213> ORGANISM: Rattus norvegicus <220>FEATURE: <221> NAME/KEY: unsure <222> LOCATION: 62, 283 <223> OTHERINFORMATION: a or g or c or t, unknown, or other <220> FEATURE: <221>NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte ID No.:700643425H1 <400> SEQUENCE: 20 ggaccagacg gtgatggtgc ccagtctgctgcgcgatgta cccctgtccg agccggatct 60 anacaacgag gtcagcgtgg aggtaggcggcagtggcagc tgcctggagg agcgcacgac 120 cccggcccca agcccgggca gcgccaatggaagctttttc gcgccctccc gggacatgta 180 cagccactac gtgctgctca agtccatccgcaacgatatt gagtggggag tcctgcacca 240 gccttcgtcc ccgccggctg ggagtgaggagggcacctgg aanccc 286 <210> SEQ ID NO 21 <211> LENGTH: 285 <212> TYPE:DNA <213> ORGANISM: Rattus norvegicus <220> FEATURE: <221> NAME/KEY:unsure <222> LOCATION: 72, 103, 154, 172, 224, 263, 264, 283 <223> OTHERINFORMATION: a or g or c or t, unknown, or other <220> FEATURE: <221>NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte ID No.:700525920H1 <400> SEQUENCE: 21 gtgctgctca agtccatccg caacgatattgagtggggag tcctgcacca gccttgcgtc 60 cccgccggct gngagtgagg agtggcacctggaagcccaa ggncatcctg gtgggcctga 120 gccacttgga gagcacggat gcgggcgaggaagntctgga gcagcagttc cnctaccacc 180 tgcgcgggct gcacaccgtg ctctccaaactcacccgcaa agcnaacatc cttaacaaca 240 gatacaagca ggagatcggc ttnntaatgggggccattga ggngg 285 <210> SEQ ID NO 22 <211> LENGTH: 270 <212> TYPE:DNA <213> ORGANISM: Rattus norvegicus <220> FEATURE: <221> NAME/KEY:unsure <222> LOCATION: 87 <223> OTHER INFORMATION: a or g or c or t,unknown, or other <220> FEATURE: <221> NAME/KEY: misc_feature <223>OTHER INFORMATION: Incyte ID No.: 700773927H1 <400> SEQUENCE: 22ggccaacatc cttaccaaca gatacaagca ggagatcggc ttcagtaatt ggggccactg 60aggcggggtt 
gtccccgctg cccagcnccc tctcgggtcg gctctaccac ccccctctct 120ttcctccaaa ctattttctt cctggttgtg gggcgcgaag ggcacgctgt aaagttgggc 180tgtgtacttg gtggggtttg tgtggagaaa acagagcaga gagcagagga aatatcgcca 240gagagggggg ttcaaagacc cccggagggc 270 <210> SEQ ID NO 23 <211> LENGTH:283 <212> TYPE: DNA <213> ORGANISM: Rattus norvegicus <220> FEATURE:<221> NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte ID No.:700513679H1 <400> SEQUENCE: 23 ggaactgggc cgatgtcctt cattcagcctgtgcctttct tggggtttct tttctctttt 60 tctttccgga agagaagggc ctgagaaagggccatgccag ggcacagcgc tgggttgcca 120 cacttgggag ggcagcttct agctgggtgctcgggggagg cggggcacag cctcctgccc 180 gccctgcttt gagctgcaag aggaggccttggcgttgcta ggattgcgtc agttttcctg 240 tttgcactat ttctttttgt aacagtgaccctgtcttaag tat 283 <210> SEQ ID NO 24 <211> LENGTH: 266 <212> TYPE: DNA<213> ORGANISM: Rattus norvegicus <220> FEATURE: <221> NAME/KEY:misc_feature <223> OTHER INFORMATION: Incyte ID No.: 700767486H1 <400>SEQUENCE: 24 tttcagatct ttttgctttg aaacttcgtc gattccattg tgataagcgcacaagcagca 60 ctgttggtaa ccggtactac tttattaatg attttctgtt acactgtacagtagtcctat 120 ggcaccccat ccctttcacg ccacccctcc cccaccccgt gtgtgtaaactggtgacgtg 180 ccagctagga tgaagcttgc cactcggcca gcgaaaataa taacattattgtgagaaagt 240 ggatttatct aaagtggaac caactg 266 <210> SEQ ID NO 25 <211>LENGTH: 199 <212> TYPE: DNA <213> ORGANISM: Rattus norvegicus <220>FEATURE: <221> NAME/KEY: unsure <222> LOCATION: 8, 36, 40 <223> OTHERINFORMATION: a or g or c or t, unknown, or other <220> FEATURE: <221>NAME/KEY: misc_feature <223> OTHER INFORMATION: Incyte ID No.:700327166H1 <400> SEQUENCE: 25 agccactngg ccagcgaaaa taataacattattgtnagan agtggattta tctaatggaa 60 ccaactgaca ttctatctgt gttgtacgtagaatgatgaa gggctccact gttgttatat 120 gtcttgttta tttaaaactt ttttttaatccagatgtaga ctatattcta aaaaataaaa 180 gctcagatgt gttaaccac 199 <210> SEQID NO 26 <211> LENGTH: 152 <212> TYPE: PRT <213> ORGANISM: Danio rerio<220> FEATURE: <221> NAME/KEY: misc_feature <223> OTHER INFORMATION:GenBank ID No: g861207 <400> 
SEQUENCE: 26
Met Gln Met Ser Glu Pro Leu Ser Gln Lys Asn Ala Leu Tyr Thr
1 5 10 15
Ala Met Asn Arg Phe Leu Gly Ala Val Asn Asn Met Asp Gln Thr
20 25 30
Val Met Val Pro Ser Leu Leu Arg Asp Val Pro Leu Asp Gln Glu
35 40 45
Lys Glu Gln Gln Lys Leu Thr Asn Asp Pro Gly Ser Tyr Leu Arg
50 55 60
Glu Ala Glu Ala Asp Met Tyr Ser Tyr Tyr Ser Gln Leu Lys Ser
65 70 75
Ile Arg Asn Asn Ile Glu Trp Gly Val Ile Arg Ser Glu Asp Gln
80 85 90
Arg Arg Lys Lys Asp Thr Ser Ala Ser Glu Pro Val Arg Thr Glu
95 100 105
Glu Glu Ser Asp Met Asp Leu Glu Gln Leu Leu Gln Phe His Leu
110 115 120
Lys Gly Leu His Gly Val Leu Ser Gln Leu Thr Ser Gln Ala Asn
125 130 135
Asn Leu Thr Asn Arg Tyr Lys Gln Glu Ile Gly Ile Ser Gly Trp
140 145 150
Gly Gln
<210> SEQ ID NO 27
<211> LENGTH: 150
<212> TYPE: PRT
<213> ORGANISM: Mus musculus
<220> FEATURE:
<221> NAME/KEY: misc_feature
<223> OTHER INFORMATION: GenBank ID No: g1171574
<400> SEQUENCE: 27
Met Gln Val Leu Thr Lys Arg Tyr Pro Lys Asn Cys Leu Leu Thr
1 5 10 15
Val Met Asp Arg Tyr Ser Ala Val Val Arg Asn Met Glu Gln Val
20 25 30
Val Met Ile Pro Ser Leu Leu Arg Asp Val Gln Leu Ser Gly Pro
35 40 45
Gly Gly Ser Val Gln Asp Gly Ala Pro Asp Leu Tyr Thr Tyr Phe
50 55 60
Thr Met Leu Lys Ser Ile Cys Val Glu Val Asp His Gly Leu Leu
65 70 75
Pro Arg Glu Glu Trp Gln Ala Lys Val Ala Gly Asn Glu Thr Ser
80 85 90
Glu Ala Glu Asn Asp Ala Ala Glu Thr Glu Glu Ala Glu Glu Asp
95 100 105
Arg Ile Ser Glu Glu Leu Asp Leu Glu Ala Gln Phe His Leu His
110 115 120
Phe Cys Ser Leu His His Ile Leu Thr His Leu Thr Arg Lys Ala
125 130 135
Gln Glu Val Thr Arg Lys Tyr Gln Glu Met Thr Gly Gln Val Leu
140 145 150
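As an illustration only, the declared LENGTH of a protein entry in the listing can be checked against the residues actually present. The following is a minimal Python sketch, not part of the disclosure: the function names `parse_residues` and `to_one_letter` are hypothetical, and the sample string is the first row of SEQ ID NO:26 written with one space deliberately omitted to show that the parser tolerates spacing lost during text extraction.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_residues(flattened):
    """Return the three-letter residue codes found in the text.

    Position numbers and whitespace are ignored, and residues that have
    run together (for example "LeuSer") are still split correctly.
    """
    return re.findall(r"[A-Z][a-z]{2}", flattened)


def to_one_letter(residues):
    """Convert a list of three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


# First row of SEQ ID NO:26, with position numbers left in place.
first_row = "Met Gln Met Ser Glu Pro LeuSer Gln Lys Asn Ala Leu Tyr Thr 1 5 10 15"
residues = parse_residues(first_row)
print(len(residues), to_one_letter(residues))
```

Applied to a complete entry, the residue count returned by `parse_residues` can be compared with the value given in the entry's LENGTH field.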

What is claimed is:
 1. An isolated polypeptide selected from the group consisting of: a) a polypeptide comprising an amino acid sequence of SEQ ID NO:2, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence of SEQ ID NO:2, c) a biologically active fragment of a polypeptide having an amino acid sequence of SEQ ID NO:2, and d) an immunogenic fragment of a polypeptide having an amino acid sequence of SEQ ID NO:2.
 2. An isolated polypeptide of claim 1, comprising an amino acid sequence of SEQ ID NO:2.
 3. An isolated polynucleotide encoding a polypeptide of claim 1.
 4. An isolated polynucleotide encoding a polypeptide of claim 2.
 5. An isolated polynucleotide of claim 4, having a sequence of SEQ ID NO:1.
 6. A recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 3.
 7. A cell transformed with a recombinant polynucleotide of claim 6.
 8. A transgenic organism comprising a recombinant polynucleotide of claim 6.
 9. A method of producing a polypeptide of claim 1, the method comprising: a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide comprises a promoter sequence operably linked to a polynucleotide encoding the polypeptide of claim 1, and b) recovering the polypeptide so expressed.
 10. A method of claim 9, wherein the polypeptide comprises the amino acid sequence of SEQ ID NO:2.
 11. An isolated antibody which specifically binds to a polypeptide of claim 1.
 12. An isolated polynucleotide selected from the group consisting of: a) a polynucleotide comprising a polynucleotide sequence of SEQ ID NO:1, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence of SEQ ID NO:1, c) a polynucleotide complementary to a polynucleotide of a), d) a polynucleotide complementary to a polynucleotide of b) and e) an RNA equivalent of a)-d).
 13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide of claim 12.
 14. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
 15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides.
 16. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
 17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable excipient.
 18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence of SEQ ID NO:2.
 19. A method for treating a disease or condition associated with decreased expression of functional LMTF, comprising administering to a patient in need of such treatment the composition of claim 17.
 20. A method of screening a compound for effectiveness as an agonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting agonist activity in the sample.
 21. A composition comprising an agonist compound identified by a method of claim 20 and a pharmaceutically acceptable excipient.
 22. A method for treating a disease or condition associated with decreased expression of functional LMTF, comprising administering to a patient in need of such treatment a composition of claim 21.
 23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting antagonist activity in the sample.
 24. A composition comprising an antagonist compound identified by a method of claim 23 and a pharmaceutically acceptable excipient.
 25. A method for treating a disease or condition associated with overexpression of functional LMTF, comprising administering to a patient in need of such treatment a composition of claim 24.
 26. A method of screening for a compound that specifically binds to the polypeptide of claim 1, the method comprising: a) combining the polypeptide of claim 1 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide of claim 1 to the test compound, thereby identifying a compound that specifically binds to the polypeptide of claim 1.
 27. A method of screening for a compound that modulates the activity of the polypeptide of claim 1, said method comprising: a) combining the polypeptide of claim 1 with at least one test compound under conditions permissive for the activity of the polypeptide of claim 1, b) assessing the activity of the polypeptide of claim 1 in the presence of the test compound, and c) comparing the activity of the polypeptide of claim 1 in the presence of the test compound with the activity of the polypeptide of claim 1 in the absence of the test compound, wherein a change in the activity of the polypeptide of claim 1 in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide of claim 1.
 28. A method of screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence of claim 5, the method comprising: a) exposing a sample comprising the target polynucleotide to a compound, under conditions suitable for the expression of the target polynucleotide, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
 29. A method of assessing toxicity of a test compound, the method comprising: a) treating a biological sample containing nucleic acids with the test compound, b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof, c) quantifying the amount of hybridization complex, and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
 30. A diagnostic test for a condition or disease associated with the expression of LMTF in a biological sample, the method comprising: a) combining the biological sample with an antibody of claim 11, under conditions suitable for the antibody to bind the polypeptide and form an antibody:polypeptide complex, and b) detecting the complex, wherein the presence of the complex correlates with the presence of the polypeptide in the biological sample.
 31. The antibody of claim 11, wherein the antibody is: a) a chimeric antibody, b) a single chain antibody, c) a Fab fragment, d) a F(ab′)₂ fragment, or e) a humanized antibody.
 32. A composition comprising an antibody of claim 11 and an acceptable excipient.
 33. A method of diagnosing a condition or disease associated with the expression of LMTF in a subject, comprising administering to said subject an effective amount of the composition of claim 32.
 34. A composition of claim 32, wherein the antibody is labeled.
 35. A method of diagnosing a condition or disease associated with the expression of LMTF in a subject, comprising administering to said subject an effective amount of the composition of claim 34.
 36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 11, the method comprising: a) immunizing an animal with a polypeptide consisting of an amino acid sequence of SEQ ID NO:2, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibodies from said animal, and c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal antibody which binds specifically to a polypeptide comprising an amino acid sequence of SEQ ID NO:2.
 37. A polyclonal antibody produced by a method of claim 36.
 38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier.
 39. A method of making a monoclonal antibody with the specificity of the antibody of claim 11, the method comprising: a) immunizing an animal with a polypeptide consisting of an amino acid sequence of SEQ ID NO:2, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibody producing cells from the animal, c) fusing the antibody producing cells with immortalized cells to form monoclonal antibody-producing hybridoma cells, d) culturing the hybridoma cells, and e) isolating from the culture monoclonal antibody which binds specifically to a polypeptide comprising an amino acid sequence of SEQ ID NO:2.
 40. A monoclonal antibody produced by a method of claim 39.
 41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier.
 42. The antibody of claim 11, wherein the monoclonal antibody is produced by screening a Fab expression library.
 43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant immunoglobulin library.
 44. A method of detecting a polypeptide comprising an amino acid sequence of SEQ ID NO:2 in a sample, the method comprising: a) incubating the antibody of claim 11 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) detecting specific binding, wherein specific binding indicates the presence of a polypeptide comprising an amino acid sequence of SEQ ID NO:2 in the sample.
 45. A method of purifying a polypeptide comprising an amino acid sequence of SEQ ID NO:2 from a sample, the method comprising: a) incubating the antibody of claim 11 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) separating the antibody from the sample and obtaining the purified polypeptide comprising an amino acid sequence of SEQ ID NO:2.
 46. A microarray wherein at least one element of the microarray is a polynucleotide of claim 12.
 47. A method of generating an expression profile of a sample which contains polynucleotides, the method comprising: a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray of claim 46 with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
 48. An array comprising different nucleotide molecules affixed in distinct physical locations on a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 12.
 49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.
 50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.
 51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to said target polynucleotide.
 52. An array of claim 48, which is a microarray.
 53. An array of claim 48, further comprising said target polynucleotide hybridized to a nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.
 54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to said solid substrate.
 55. An array of claim 48, wherein each distinct physical location on the substrate contains multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical location have the same sequence, and each distinct physical location on the substrate contains nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at another distinct physical location on the substrate. 
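
The notions of a complementary polynucleotide and of a probe sharing a minimum run of contiguous nucleotides with a target, as used in the claims above, can be illustrated computationally. The following Python sketch is offered only as an illustration under stated assumptions and does not define or limit any claim: the function names and the short target sequence are hypothetical, matching is exact (real hybridization tolerates mismatches depending on conditions), and the unknown base "n" is simply mapped to itself.

```python
# Complement table for DNA. The unknown base 'n' maps to itself (an assumption).
COMPLEMENT = str.maketrans("acgtnACGTN", "tgcanTGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def shares_complementary_run(probe, target, min_len=20):
    """Return True if at least min_len contiguous nucleotides of the probe
    are exactly complementary to a contiguous stretch of the target."""
    rc = reverse_complement(probe)
    return any(
        rc[i:i + min_len] in target
        for i in range(len(rc) - min_len + 1)
    )


# Hypothetical target, not taken from the sequence listing.
target = "ggatccatgcaagcttgcatgcctgcaggtcgac"
# A 25-nucleotide probe built to be complementary to part of the target.
probe = reverse_complement(target[5:30])

print(reverse_complement("atgcn"))
print(shares_complementary_run(probe, target, min_len=20))
```

The default of 20 nucleotides mirrors the probe length recited in claim 14; passing `min_len=30` or `min_len=60` would mirror the lengths recited in other claims.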